MASSACHUSETTS INSTITUTE OF TECHNOLOGY 
ARTIFICIAL INTELLIGENCE LABORATORY 


A.I. T.R. No. 1260


October 1990


COMPUTATIONAL STRUCTURE OF HUMAN LANGUAGE

Eric Sven Ristad 


The central thesis of this report is that human language is NP-complete. That is,
the process of comprehending and producing utterances is bounded above by the
class NP, and below by NP-hardness. This constructive complexity thesis has two
empirical consequences. The first is to predict that a linguistic theory outside NP
is unnaturally powerful. The second is to predict that a linguistic theory easier
than NP-hard is descriptively inadequate.

To prove the lower bound, I show that the following three subproblems of language
comprehension are all NP-hard: decide whether a given sound is a possible sound of
a given language; disambiguate a sequence of words; and compute the antecedents
of pronouns. The proofs are based directly on the empirical facts of the language
user's knowledge, under an appropriate idealization. Therefore, they are invariant
across linguistic theories. (For this reason, no knowledge of linguistic theory is
needed to understand the proofs, only knowledge of English.)

To illustrate the usefulness of the upper bound, I show that two widely-accepted
analyses of the language user's knowledge (of syntactic ellipsis and phonological
dependencies) lead to complexity outside of NP (PSPACE-hard and Undecidable,
respectively). Next, guided by the complexity proofs, I construct alternate linguistic
analyses that are strictly superior on descriptive grounds, as well as being less
complex computationally (in NP).

The report also presents a new framework for linguistic theorizing, that resolves
important puzzles in generative linguistics, and guides the mathematical investigation
of human language.


Acknowledgments:

My debt is to my teachers, to whom this report is dedicated: to Robert
C. Berwick, who got me started in this work, carefully read long drafts on short
notice, and shepherded me through MIT into the faculty job of my choice; to Noam
Chomsky, who awakened me to the ruthless pursuit of rational thought and empirical
truth, was always willing to argue with me, and has single-handedly dragged
the study of thought and power out of the slough of idiocy which appears to be its
natural resting place; to John Cooke, who was my first mentor; to Morris Halle,
who taught me all the phonology I know, including remedial sessions at no extra
cost, and always tried to understand what I couldn't always express; and to Jorma
Rissanen, who graciously taught me the philosophy and practice of statistical
modelling with the minimum description length principle, all in one summer.

The ideas of the report exist only due to the help and intellectual company of
Sandiway Fong, James Higginbotham, Howard Lasnik, and Alec Marantz.

A special thank you to my thesis readers Peter Elias and Michael Sipser for their
help; to Eric Grimson for being a peerless academic advisor; and to Patrick Winston
for clarifying my points and helping me out in times of distress.

Stephen Anderson, Viviane Deprez, Ken Hale, Hilda Koopman, and Peter Sells
have shared their invaluable knowledge of the world's pronominal systems with me.

For their assistance, thanks to Esra Black Pimxlilcu ■Cs.url, Faith Frlek, Alexandra 
Giorgi, Maria-Theresa Guasti, Ron Greenberg, Ray Hirschfeld, Norbert Hornstein,
Dave Hirsh, Richard Larson, Tomas Lozano-Perez, Scott Meredith, Albert Meyer,
Geoff Pullum, Tanya Reinhart, Sally Richter, Giorgio Satta, Barry Schein, Deb
Sterling, Chris Tancredi, Esther Torrego, Amy Weinberg, and Ken Wexler.

The author has been supported by a generous four year graduate fellowship from the
IBM Corporation, for which he is extremely grateful. This report describes research
done in part at the Artificial Intelligence Laboratory of the Massachusetts Institute
of Technology. Support for the Laboratory's artificial intelligence research has been
provided in part by a grant from the Kapor Family Foundation, in part by NSF
Grant DCR-85552543 under a Presidential Young Investigator Award to Professor
Robert C. Berwick, and in part by the Advanced Research Projects Agency of the
Department of Defense under Office of Naval Research contract N00014-85-K-0124.

This report is a slightly revised version of a thesis submitted to the MIT Department
of Electrical Engineering and Computer Science in May 1990, in partial fulfillment
of the requirements for the degree of Doctor of Philosophy. The major additions
are a new lower bound proof for the anaphora problem in section 4.1, and the
philosophical discussion of appendix A.1.


© Massachusetts Institute of Technology, 1990




Preface: Mathematical Analysis
in the Natural Sciences


The purpose of this report is to elucidate the structure of human language, in
terms of the mathematics of computation and information. Its central thesis
is that human language, the process of constructing structural descriptions,
is NP-complete. The body of the report is devoted to defending this point.

Any such mathematical investigation of what is fundamentally a topic in
the natural sciences must be both relevant and rigorous. It must simultaneously
satisfy the seemingly incompatible standards of mathematical rigor
and empirical adequacy. Of the two criteria, relevance is the more important,
and judging by historical example, also the more difficult to satisfy.
If an investigation is not rigorous, it may be called into question and the
author perhaps asked to account for its lack of rigor; but if it is not relevant,
it will be ignored entirely, properly dismissed as inappropriate mathematics.

In order to ensure relevance, such an investigation must demonstrate a
comprehensive understanding of the natural science (linguistics, in this case)
and provide two warrants.

The first warrant is a conceptual framework for the investigation, so that 
the mathematics is used to answer relevant questions. The framework must 
include a technique for performing the analysis in the domain of the chosen 
natural science. This technique must ensure that the insights of the science 
are preserved, and must be sufficiently general so that others can extend the
investigation. 

The second warrant is a contribution to the natural science itself. Such a
contribution might take the form of a simply-stated mathematical thesis that
is an independent guide to scientific investigation in the chosen domain. To
be useful, this thesis must make strong predictions that are easily falsified in
principle, but repeatedly confirmed in practice. Only under these conditions
is it possible to develop confidence in such a thesis.

An exemplary mathematical investigation into human language may be
found in the work of Noam Chomsky. In Three Models for the Description
of Language, Chomsky (1956) defined a framework within which to
examine the empirical adequacy of formal grammars. He posed the following
questions: Do human languages require linguistic descriptions that are
outside the range of possible descriptions? Can reasonably simple grammars
be constructed for all human languages? Are such grammars revealing, in
that they support semantic analysis and provide insights into the use and
understanding of language? This is the first warrant.

The article also provided the second warrant, in the form of a simply-stated
complexity thesis, namely that human language has a finite (but no finite-state)
characterization, and that the simplest and most revealing characterization
is given by a class of unrestricted rewriting systems. The essay
demonstrates an unqualified commitment to understanding human language.
See, by way of contrast, Curry (1961), Lambek (1961), Peters and
Ritchie (1973), and Plátek and Sgall (1978).

In this report, I argue that language is the process of constructing linguistic
representations from extra-linguistic evidence, and that this process is
NP-complete. This brings Chomsky's 1956 complexity thesis up-to-date, with
the advantages of a better understanding of language (the result of thirty
years of productive research in linguistics) and a more precise theory of
structural complexity, based on computational resources rather than on the
format of grammars or automata. The resulting thesis is also much stronger,
providing tight upper and lower bounds, and therefore is a truly constructive
complexity thesis for human language.

The complexity thesis is defended with a novel technique, called the direct
analysis, that may be contrasted to prior analyses, which have all been
indirect. An indirect analysis is an analysis of a formal system within which
linguistic knowledge may be represented, a kind of programming language
for natural language processing. Indirect proofs are based on the ad-hoc
particulars of the formal system, and only very tenuously (if at all) on
empirical facts. Indirect analyses of human language may be found in Peters
and Ritchie (1973) and Barton, Berwick and Ristad (1987). In contrast, a
direct analysis is a mathematical analysis of linguistic knowledge itself. In it,
proofs are based directly on the empirical facts of linguistic knowledge, and
therefore are invariant with respect to our scientific ignorance. To the best
of my knowledge, this report contains the first direct complexity analyses of
human language, which provides the first warrant.

The complexity thesis makes strong predictions, because many proposed
linguistic theories violate it, and because the NP lower bound is in sharp
contrast to the prevailing belief that language is efficient, which is held by
many linguists, psycholinguists, and computational linguists. In the body of
the report, I prove the lower bound in three distinct domains, using direct
complexity analyses. I demonstrate the utility of the NP upper bound by
using it to guide the revision of the segmental theory of phonology, and
of the copy-and-link theory of syntactic ellipsis. This provides the second
warrant.




Contents 


1 Foundation of the Investigation
1.1 The conceptual framework
1.2 The constructive complexity thesis
1.3 Direct complexity analysis
1.4 Summary of the report

2 Structure of Phonological Knowledge
2.1 Segmental phonology
2.1.1 Complexity of segmental recognition and generation
2.1.2 Restricting the segmental model
2.1.3 The SPE evaluation metric
2.1.4 Modeling phonological dependencies
2.2 Autosegmental phonology
2.2.1 Complexity of autosegmental recognition
2.2.2 Suprasegmental dependencies

3 Syntactic Agreement and Lexical Ambiguity
3.1 Morpho-syntactic dependencies
3.2 Complexity of linguistic transforms
3.3 Complexity of agreement interactions
3.4 Conclusions
3.4.1 Locality in linguistic theory
3.4.2 The search for uniform mechanisms

4 Complexity of Anaphora
4.1 Two proofs of the NP lower bound
4.1.1 Agreement reconsidered
4.1.2 Relations of referential dependence
4.1.3 From satisfiability to referential dependence
4.2 Evidence for an NP upper bound
4.2.1 Simple theory of ellipsis
4.2.2 Complexity outside NP
4.2.3 Ellipsis reconsidered
4.3 Analysis of linguistic theories

5 References

A Philosophical Issues
A.1 Implications of a complexity classification
A.2 Unbounded idealizations
A.2.1 Unbounded agreement features
A.2.2 Unbounded phonological features
A.2.3 Limiting unbounded idealizations

B Structure of Elliptical Dependencies
B.1 Previous work reconsidered
B.1.1 Covariance reduced to predication of subjects
B.1.2 The problem of nonsubject covariance
B.1.3 Covariance reduced to bound anaphora
B.2 Invisible obviation
B.3 Necessary structure of an explicit theory
B.4 The proposed system of representation
B.4.1 Segregate phrase and thematic structure
B.4.2 Two relations of referential dependency
B.4.3 Ellipsis as a shared thematic-function
B.5 The space of elliptical structures
B.5.1 Possible domains of ellipsis
B.5.2 Invisible crossover
B.5.3 Invisible obviation
B.5.4 Recursive ellipsis
B.6 Conclusion













Chapter 1 

Foundation of the 
Investigation 


The chief goal in the computational study of human language is to design
a computational model of the language user, that explains the processes of
comprehending, producing, and acquiring languages. The central obstacle
that we encounter in such an investigation, and in engineering systems for
language perception, is our incomplete scientific understanding. In order to
overcome this obstacle, we need an independent a priori criterion to guide
the design of our language models and the revision of our linguistic theories.

The goal of this report is to characterize the complexity of human language,
using only empirical facts of linguistic knowledge. Such a complexity thesis,
with tight upper and lower bounds, yields a useful design criterion. It
objectively measures the significance of a particular design decision, in terms
of its effect on the complexity of the language model. It tells us when a
language model is too restrictive: language models that do not satisfy the
lower bound do not have an adequate account of complex linguistic
phenomena. Any particular design decision or empirical generalization may of
course violate the lower bound, which is relevant only to the model in its
entirety. A complexity thesis also tells us what is a reasonable empirical
generalization: that in the absence of overwhelming counterevidence, empirical
generalizations must conform to the upper bound.

This introduction establishes the foundation of the research, on which a
constructive complexity thesis for human language is built. According to
the guideline set forth in the preface, the chapter must accomplish three
things: first, provide a conceptual framework that poses relevant questions
for mathematical analyses; second, introduce and motivate the central thesis
of the report, that human language is NP-complete; and third, discuss the
formal technique of a direct complexity analysis, that will preserve the
insights of linguistics in the details of the complexity analysis. Let me consider
each in turn.

1.1 The conceptual framework

What is language? According to generative linguistics, language is a
cognitive system of knowledge. The generative grammar of a particular
human language enumerates all and only the possible (complete, grammatical)
structural descriptions of that language. These structural descriptions are
acquired on the basis of experience, and put to use in the production and
comprehension of expressions.

An important component of linguistic knowledge is knowledge of linguistic
dependencies, that is, how the parts of a linguistic form depend on each
other. For example, speakers of English know that the subject and main verb
of the matrix clause agree with each other; knowing that the subject is plural
predicts that the main verb is also marked plural, even though this marking
may not be overt. Speakers of some English dialects also know that voiced
consonants are immediately preceded by long vowels; that /t/ and /d/ are
both pronounced as the voiced flap [D] after a stressed vowel, if another vowel
follows; and that this voicing depends on the vowel lengthening process.
This knowledge of phonological dependencies explains why pairs of words
like writer/rider and latter/ladder are distinguished phonetically only by the
length of their first vowel, while related pairs such as write/ride maintain the
underlying voicing distinction between /t/ and /d/. From an
information-theoretic perspective, then, a generative grammar provides a constructive
characterization of the informational dependencies in the surface forms (i.e.,
the expressions) of a particular language.
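
The interaction between vowel lengthening and flapping described above can be sketched as two ordered rewrite rules. This is a toy illustration, not the report's formalism: the segment class `Seg`, the symbols "ay" and "er", and the function names are hypothetical, and the sketch assumes that lengthening applies before flapping, which is one common way to model the dependency.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Seg:
    """A toy segment: a symbol plus the few features the two rules need."""
    sym: str
    vowel: bool = False
    voiced: bool = False
    stressed: bool = False
    long: bool = False

def lengthen(word):
    """Lengthen a vowel that immediately precedes a voiced consonant."""
    out = list(word)
    for i in range(len(out) - 1):
        nxt = out[i + 1]
        if out[i].vowel and nxt.voiced and not nxt.vowel:
            out[i] = replace(out[i], long=True)
    return out

def flap(word):
    """Turn /t/ or /d/ into the voiced flap D between a stressed vowel and a vowel."""
    out = list(word)
    for i in range(1, len(out) - 1):
        prev, nxt = out[i - 1], out[i + 1]
        if out[i].sym in ("t", "d") and prev.vowel and prev.stressed and nxt.vowel:
            out[i] = Seg("D", voiced=True)
    return out

def surface(word):
    """Spell out a segment list, marking long vowels with ':'."""
    return "".join(s.sym + (":" if s.long else "") for s in word)

def underlying(medial):
    """Hypothetical underlying form r-ay-X-er with a stressed first vowel."""
    return [Seg("r", voiced=True), Seg("ay", vowel=True, stressed=True),
            medial, Seg("er", vowel=True)]

writer = underlying(Seg("t"))
rider = underlying(Seg("d", voiced=True))

print(surface(flap(lengthen(writer))))  # rayDer
print(surface(flap(lengthen(rider))))   # ray:Der
print(surface(lengthen(flap(writer))))  # ray:Der (wrong order merges the pair)
```

With lengthening first, the two words surface with the same flap and differ only in the length of the first vowel; applying the rules in the opposite order erases the distinction, which is why the order of the dependency matters in this sketch.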

The generative framework poses a number of conceptual puzzles, that must
be resolved in order to understand what language is.

* The first puzzle is the relation between production and comprehension.
Why can people learn to comprehend and speak the same language?
And why can a sound mean the same thing to the person who spoke
it and the person who heard it? Generative linguistic theory postulates
that these two abilities, production and comprehension, share
one component: knowledge of language. But why should Broca's and
Wernicke's Areas, the two distinct regions of the brain that seem to
perform these tasks, have the same knowledge of language? And how
can one "knowledge component" solve the entirely different problems
encountered in production and comprehension? In electronic communication
systems, the transmitter and receiver rarely even share hardware
because they perform such entirely different functions. They
interact successfully only because the human designer intended them
to.

* A related puzzle is to explain comprehension independent of production,
and vice-versa. That is, how can we explain what it means to
comprehend an utterance without direct access to the intentions of
the producer? To comprehend an utterance cannot mean to find the
exact structural description in the producer's head, because that is
never available. It cannot mean to find some structural description of
the utterance, because this allows the null structural description as a
trivial (non)solution. Nor can it mean to find all possible structural
descriptions for the utterance, because this does not tell us which one
is the "intended" structural description.

* A third puzzle is how language can be at once so complex and yet easy
to use. Consider the sentence [Bill expected to see him]. The pronoun
him cannot refer to Bill. But as a part of another sentence, I wonder
who [Bill expected to see him], it can. Linguistics tells us that complex
systems are needed to describe this kind of complex phenomena. Computer
science tells us that complex systems do not perform effortlessly.
Yet language processing seems to be efficient. How can this be? This
is Cordemoy's paradox, named after the Cartesian linguist Geraud de
Cordemoy who observed, "We can scarce believe, seeing the facility
there is in Speaking, that there should need so many parts to be acted
for that purpose: But we must accustom ourselves by admiring the
structure of our Body, to consider that 'tis made by an incomparable
workman, who is inimitable." (1667, pp.84-5)

* The fourth puzzle is how linguistic knowledge is used in actual
performance. People produce, comprehend, and acquire languages. These
are the empirical phenomena in need of scientific explanation.
Generative linguistics attempts to explain these phenomena by postulating
theories of linguistic knowledge. But how is this knowledge used?
What is the exact relation between a theory of knowledge and a model
of the language user? Lacking an answer to this central scientific
question, the theory is at best incomplete; at worst, it is incoherent.

These puzzles are best resolved by the conceptual framework within which
particular theories are to be proposed. A framework does not itself answer
empirical questions. Rather, it explains how these questions will be answered
by particular theories.

The conceptual framework of this mathematical investigation may be
summarized as follows. The language faculty is a mental organ that performs
a computation, which we may call "language." Language is the process of
constructing representations from evidence. The central questions that arise
in such an approach to language are: what is evidence, what are
representations, and what is the relation between representation and evidence? I
adopt an information-theoretic perspective on these issues.

Briefly, the language faculty lies at the interface of several cognitive
systems, including the mental lexicon and motor, perceptual, and conceptual
systems. The forms continually produced by these cognitive systems are the
instantaneous evidence that the language faculty sees. Language, then, is
the process of computing the informational dependencies among the codes
continually produced by these cognitive systems. A generative grammar is
an enumerative code for this instantaneous evidence. Linguistic
representations are representations of these dependencies, and the structural
description constructed by the language faculty at a particular instant in time is
the one that most reduces the apparent information in the instantaneous
evidence, that is, the best description of the evidence. An overt expression,
whether spoken or written, constitutes only a very small part of the total
instantaneous evidence available to the language faculty. The more
additional evidence that the language faculty has at the moment of perception,
the more complete the structural description of that evidence will be, and
the less random the utterance appears to the language faculty.
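
The idea of choosing the description that most reduces the apparent information in the evidence can be illustrated with a minimal two-part code in the spirit of the minimum description length principle. The sketch below is an assumption-laden toy, not the report's model: the evidence is a plain string, the only candidate descriptions are "repeat this prefix" patterns, and the bit costs and the names `literal_bits`, `description_bits` and `best_description` are hypothetical.

```python
import math

def literal_bits(symbols, alphabet_size):
    """Cost of spelling out symbols one by one, with no structure assumed."""
    return len(symbols) * math.log2(alphabet_size)

def description_bits(unit, evidence, alphabet_size):
    """Two-part cost: state the repeating unit and a repeat count,
    then spell out whatever part of the evidence the pattern leaves unexplained."""
    reps = 0
    while evidence.startswith(unit * (reps + 1)):
        reps += 1
    unexplained = evidence[len(unit) * reps:]
    # Assumed coding scheme: the count is one of len(evidence) + 1 values.
    model = literal_bits(unit, alphabet_size) + math.log2(len(evidence) + 1)
    return model + literal_bits(unexplained, alphabet_size)

def best_description(evidence, alphabet_size):
    """Return (unit, bits) for the cheapest candidate; unit '' means
    'no structure found', i.e. the evidence is stored literally."""
    candidates = {"": literal_bits(evidence, alphabet_size)}
    for n in range(1, len(evidence) // 2 + 1):
        unit = evidence[:n]
        candidates[unit] = description_bits(unit, evidence, alphabet_size)
    unit = min(candidates, key=candidates.get)
    return unit, candidates[unit]

print(best_description("abababab", 2)[0])  # ab
print(best_description("abbabaab", 2)[0])  # prints an empty line: no pattern pays off
```

In this toy setting, evidence with a recoverable dependency receives a short description and so looks less random, while evidence with no usable structure is simply stored as it is.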

This constructive (as opposed to generative) explication of what language is
explains how the preceding puzzles are to be solved:

* Production and comprehension are the same process of representation
construction. The central difference between the two is in the direction
of data dependencies, and in the distribution of the instantaneous
evidence. For example, in production the perceptual system provides
less evidence, while in comprehension, it provides more.[1]

* It defines comprehension without reference to the producer's
intentions, because comprehension and production always compute the best
description of the available evidence. When there is sufficient evidence
available (from the cognitive model of the speaker's intentions, the
perceptual system, and the priming of lexical entries, for example),
then the comprehender constructs the same representation that the
producer intended.

* A linguistic representation is the best description of the available
evidence. The best description of incomplete or insufficient evidence is an
incomplete representation. Therefore, the language faculty need never
perform a blind search that can lead to computational intractability.
The language faculty does not assign a complete representation to
incomplete evidence, even though some complete representation might
be consistent with the evidence.[2]

* The constructive framework also suggests a way to understand the 

[1] This framework therefore keeps the Cartesian separation between the process of
constructing mental representations, and the perceptual evidence that justifies a
particular mental representation. Note, however, that the framework fails to provide any
understanding of the creative aspects of production that so concerned Cartesians.

[2] In this framework, language cannot be the computational module that Fodor (1983)
argues it is, because it is not informationally encapsulated. Contra Fodor, this is
scientifically desirable because it is the only way to explain how language comprehension
is possible at all in the face of wild perceptual under-determination. If the only input to
language comprehension was sensation, or even an abstract noise-free expression, then
the problem of constructing anything like the intended representation would be ill-posed
in a perverse way. (The same is true for a statement of the language production problem
whose sole input was a representation of 'meaning'.) As is well-known, a system consisting
of computational modules is necessarily inefficient, both computationally and statistically.
(Restricting the amount of information available to a module results in a computational
inefficiency because that module is not able to prune branches in its computation tree as
early as it might otherwise be able to. It results in a statistical inefficiency because a
module might need to examine all available evidence in order to determine the optimal
estimate, cf. Wax and Kailath, 1985.) The constructive framework suggests an answer to
Cordemoy's paradox, namely, language is the process of efficiently constructing the best
description of all the available evidence. In short, the only modules in human language
are strict knowledge modules, not computational modules of any kind.



relationship between generative theory and a computational model of
the language user. A generative grammar is a constructive theory of
the informational dependencies in the extra-linguistic mental codes.
As such, it enumerates the set of complete structural descriptions,
and thereby provides a partial, extensional characterization of the
relation between extra-linguistic codes and their structural descriptions:
"extensional" because a generative grammar only describes the set of
possible outputs, and "partial" because this set is limited to complete
structural descriptions of complete, noise-free expressions. For these
reasons, a generative theory is a necessary first step in the design of
an adequate constructive theory of human language.

However, the theory of grammar does not specify the function to be
computed by the language model. For one, the input to the two
computations is not the same. The input to the language model is the
set of codes produced by the other cognitive systems; the input to a
generative grammar is an underlying form, which is the index of
enumeration. Nor are the possible outputs of the two computations the
same. The language model assigns a structural description to every
input, and therefore the set of possible outputs must include partial
descriptions, for inconclusive evidence; the generative grammar only
enumerates complete structural descriptions (cf. Jakobson and Halle,
1956). Grammar and language model also specify different relations
between a structural description and the expression that is its overt
terminal string yield. The generative grammar specifies a relation
between complete structural descriptions and complete, noise-free
expressions. The language model specifies a relation between structural
descriptions and extralinguistic evidence; the relation between
structural descriptions and their overt expressions is an almost inconsequential
subset of this relation, that also includes structural descriptions
for incomplete or noisy expressions.[3]

[3] For these reasons, a parser cannot be a language model, or even part of a language
model. A language model is a function from the instantaneous evidence, which looks
nothing like an abstract string of terminal symbols, to the best structural description of
that evidence. A parser is a function from a string of terminal symbols to that set of
structural descriptions whose yield exhausts that symbol-string. For this reason it is plausible
to maintain that "the theory of grammar ... specifies the function to be computed by the
parser" (Berwick and Weinberg, 1984:82). However, it is crucial to realize that parsing
has almost no relation to the comprehension, production, or acquisition of languages, and
therefore is of little or no scientific interest.





This is simply to say, generative theory is not a model of human
language at any level of abstraction. Rather, it is a model of linguistic
knowledge.[4]

Again, it is important to stress that this is a framework for addressing
scientific questions and performing mathematical analysis, not a scientific theory
itself. The substantive answers to these puzzles lie in the next generation of
constructive linguistic theories.

However, for our present purposes this constructive framework (that
language is the process of constructing representations of extralinguistic inputs)
poses relevant questions for mathematical analysis. By equating
comprehension and production in a fundamental manner, the framework says that
the language comprehension problem is a proxy for language as a whole.
Therefore, we are assured that the mathematical analysis of subproblems of
language comprehension will be relevant to language as a whole.

To view human language from an information-theoretic perspective as we
have done here is not to say that language is designed for communication, or
even that its primary use is to communicate. Nor do we claim that language
is a general-purpose computer or that language is designed for computation,
when we analyze the computational structure of language. The goal of
this research is to understand human language in its own terms; computer
science and information theory provide useful technical metaphors. That is,
important aspects of human language receive an insightful interpretation in
terms of computation and information.

[4] These points are subtle, and have confused many people. No less a scientist than
Marr (1980) has confused the competence-performance distinction of generative
linguistics with levels of computational abstraction. But, as we have seen, the
relationship between competence and performance is not one of abstraction. Competence and
performance are simply entirely different classes of computations, both of which may be
described at different levels of abstraction. For alternate interpretations of the relation
between generative grammar and human language, at odds with the one presented here,
M4 Chomsky (J9U), Chomsky 5i abler H L 3B 21, Berwick and Weinbejc and’ 

Bftltd ud Berwick (iflfltl), 



1.2 The constructive complexity thesis 


The central thesis of this work is that human language has the structure
of an NP-complete problem. An NP-complete problem is hard to solve
because the input to the problem is missing some crucial information (the
efficient witness), but once the efficient witness (the solution) is found, it is
easily verified to be correct. As stressed by the great nineteenth century
linguist Wilhelm von Humboldt, every sound uttered as language is assigned a
complete meaning and linguistic representation in the mind of the producer.[7]
It is the task of comprehension to find the intended representation, given
only the utterance. When the utterance is missing crucial disambiguating
information, and there are global dependencies in the structural description,
then the task of finding the intended representation quickly becomes very
difficult. (In effect, we are able to prove the lower bound, that language is
NP-hard, because the amount of useful information in the evidence is not
a parameter of current linguistic theories; all reductions below take
advantage of this fact, which became obvious only in retrospect.) Yet we know
comprehension cannot be too difficult, simply because there is always an

[5] This complexity thesis is constructive because the upper and lower bounds are tight
enough to tell us exactly where the adequate linguistic theories lie, not only where they
are not. Here "constructive" means "useful." It is in contrast to Chomsky's complexity
thesis, which is not as useful because, as argued in Chomsky (1965), the upper
bound is too loose, the lower bound is weak, and the formal language theory of
structural complexity is not sufficiently precise. The conceptual framework presented in the
preceding section is constructive because it outlines a model of the language user, that
would directly characterize the computations performed in the production, comprehension,
and acquisition of human languages. There "constructive" means "given explicitly,
by construction." It is in contrast to the generative framework, which provides a partial,
extensional characterization of production, comprehension, and acquisition.

[6] As should be clear from the discussion, NP-completeness is not the stigma that many
apparently think it is. In fact, exactly the opposite is true. The thesis argues that more
efficient language models are fundamentally inadequate, barring of course a revolution in
our understanding of language or of nondeterminism.

[7] "The sentence is not to be constructed, is not to be gradually built up of components,
but is to be expressed all at once in a form compressed to unity. ... Man inwardly relates a
complete meaning with every sound emitted as language; that is, for him, it is a complete
utterance. Man does not intentionally emit merely an isolated word, even though his
statement according to our viewpoint may only contain such an entity." (von Humboldt,
1836:118-119) (The upper bound of Chomsky's 1956 complexity thesis, that language has
a finite description, also appears to be motivated by an observation due to von Humboldt,
that language is the "infinite use of finite means.")

efficient witness, namely the linguistic representation from which the utterance
was produced. If only the comprehender had the same evidence that
the producer did. then lie would be abk L to efficiently compute the intended 
structural description,, because the producer did. 
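The asymmetry described above, between finding a witness and checking one, can be illustrated with a minimal sketch. The example uses Boolean satisfiability rather than any linguistic problem, and the names `verify` and `search` and the toy formula are assumptions made for illustration only; they are not part of the report.

```python
from itertools import product

# A formula is a list of clauses; a clause is a list of (variable, wanted_value) literals.

def verify(formula, witness):
    """Check a proposed witness in time linear in the size of the formula."""
    return all(any(witness[var] == want for var, want in clause) for clause in formula)

def search(formula):
    """Find a witness by brute force, trying up to 2**n candidate assignments."""
    variables = sorted({var for clause in formula for var, _ in clause})
    for values in product([False, True], repeat=len(variables)):
        candidate = dict(zip(variables, values))
        if verify(formula, candidate):
            return candidate
    return None  # no witness exists

# (a or b) and (not a) and (b or not c)
formula = [[("a", True), ("b", True)], [("a", False)], [("b", True), ("c", False)]]
witness = search(formula)
print(witness)
```

The comprehender is in the position of `search`, while someone who already holds the intended representation is in the position of `verify`.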

The central empirical consequence of this thesis is that scientifically adequate
language models must be NP-complete, under appropriate idealizations (see
appendix A.2). If a linguistic system is outside NP, say PSPACE-hard,
then the thesis predicts that the system is unnaturally powerful, perhaps
because it overgeneralizes from the empirical evidence or misanalyzes some
linguistic phenomena. Such a system must be capable of describing
unnatural languages. If, however, a complete system is easier than NP-hard,
and assuming P ≠ NP, then the system is predicted to be unnaturally
weak, most likely because it does not adequately account for some
complex linguistic phenomena. Such a system will not be able to describe all
human languages. Otherwise the system is NP-complete and is potentially
adequate, pending the outcome of more exacting tests of scientific adequacy.

The thesis is weakened if any of the formal arguments presented in this
dissertation are refuted. It is falsified if either of its central predictions is
falsified. That is, someone must exhibit a comprehensive theory of human
language and prove it to not have the structure of an NP-complete problem,
or someone must exhibit some complex linguistic phenomena and argue that
its complexity is outside NP.

This complexity thesis, then, is an independent guide to the study of
language. It is useful because it is a simple decision procedure with which
to evaluate linguistic systems, both theoretical and implemented. The
student of language will find it helpful, because, as this dissertation
demonstrates, linguistic analyses that have complexity outside of NP are ripe for
reanalysis.
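Read as a decision procedure, the thesis amounts to a three-way classification. The sketch below is only a restatement of the preceding paragraphs in code; the function name `thesis_prediction` and its two Boolean inputs are hypothetical, and the sketch assumes P ≠ NP, as the text does.

```python
def thesis_prediction(in_np: bool, np_hard: bool) -> str:
    """Classify a linguistic system according to the thesis, assuming P != NP.

    in_np   -- the system's problems are known to lie inside NP
    np_hard -- the system's problems are known to be NP-hard
    """
    if not in_np:
        return "unnaturally powerful"   # e.g. shown PSPACE-hard or undecidable
    if not np_hard:
        return "unnaturally weak"       # cannot describe all human languages
    return "potentially adequate"       # NP-complete; further tests still needed

print(thesis_prediction(in_np=False, np_hard=True))
print(thesis_prediction(in_np=True, np_hard=False))
print(thesis_prediction(in_np=True, np_hard=True))
```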

1.3 Direct complexity analysis


The logical next step is to establish this NP-completeness thesis. The central
technical obstacle encountered in this work, and in all research on language,
is the incomplete nature of our scientific understanding of language. For this
reason, it is not clear how to precisely define any computational problem
related to human language at all. Our understanding of human language is
neither comprehensive, detailed, nor stable. Any formal model of language,
obtained perhaps by formalizing some particular linguistic theory, will be
based as much on our scientific ignorance as on our understanding.
Consequently, no meaningful or comprehensive formalization is possible, and
any mathematical analysis of such a formal system would have little or no
relevance to language itself.

To overcome this difficulty, we must seek an analysis that is invariant with
respect to our ignorance. That way, future work may enrich our analysis,
but not falsify it. We may accomplish this objective with a direct analysis.

A direct analysis is a mathematical analysis that relies directly on
well-understood empirical arguments about the language user's knowledge of
language in order to prove mathematical properties of that knowledge. It
does not rely on a complete formal model. In a direct analysis, we use
the scientific methods of linguistics to construct the simplest theory of a
natural, well-understood class of linguistic knowledge, and then analyze the
properties of this knowledge. We perform such direct analyses in section 2.2
for the language user's knowledge of suprasegmental phonological
dependencies, and in chapter 4 for knowledge of referential dependencies. Slightly
less direct analyses may be found in section 2.1 and chapter 3.

Prior mathematical analyses of language have all been highly indirect,
proving properties of formal systems within which theories of linguistic
knowledge may be represented. This is like analyzing the properties of FORTRAN
in order to better understand QUICKSORT.

A related difficulty is that it is not known how to define language
comprehension (LC) without reference to the producer's intentions. As mentioned
above, the real solution to this difficulty must be provided by the next
generation of linguistic theories. The temporary solution adopted here is to select
subproblems of LC that may be defined independent of the producer's
intentions, and are necessary subproblems of any reasonable constructive theory
of comprehension. (Given the "reasonableness" equivocation in the second
clause, it is admittedly more a matter of art than science to define such
subproblems.)


1.4 Summary of the report

The technical content of the report is apportioned into three chapters and 
two appendices: 




Chapter 2 examines the computational structure of phonological
dependencies. It begins by establishing the undecidability of both generation and
recognition problems for the segmental model of phonology. Guided by the
complexity analysis, we propose a broad range of substantive restrictions
that result in a more natural segmental model whose generation and
recognition problems are both in NP. Next, a direct analysis is provided for the
complexity of suprasegmental dependencies. The chapter closes by
revealing the indispensable role of a complexity thesis in the design of a theory of
human knowledge, to guard against gross overgeneralizations.

Chapter 3 examines the LC problem in the domain of morphology and
syntax. We demonstrate the art of choosing a subproblem of language
comprehension that is relevant despite our inability to define the LC problem itself,
and prove the chosen subproblem to be NP-hard. The chapter concludes
with a critique of the pursuit of uniform mechanisms in linguistic theory.

Chapter 4 provides a direct analysis of the anaphora problem, which is to
determine the intended antecedents of anaphoric elements in a discourse. First
we prove in two entirely different ways that this LC subproblem is NP-hard,
based on empirical facts of the language user's knowledge of pronominal
reference, such as why John saw him cannot mean "John saw John". Next,
we show how a widely-accepted linguistic theory of syntactic ellipsis makes
the anaphora problem PSPACE-hard. Finally, guided by the complexity
thesis, we falsify this linguistic theory and sketch an empirically superior
theory of ellipsis that reduces the complexity of anaphora to inside NP.
The conclusion to this chapter critiques an alternate approach to the
mathematical investigation of language, based on the complexity analysis of the
computational problems posed by linguistic theories.

Appendix A discusses two philosophical issues relevant to the research
reported here. First we discuss the implications of complexity classifications
for biological computations, and untangle the distinction between
competence and performance. Next we examine the de facto idealizations to
unbounded inputs and unbounded linguistic distinctions that have been made
implicitly in every serious theory of human language, as they must always
be.

Appendix B provides another warrant demanded in the preface, that the
work make a contribution to the natural science itself. The contribution is
an improved generative theory of syntactic ellipsis, with a discussion of a
hitherto unnoticed phenomenon of invisible obviation, with important
implications for the theory of anaphora. This appendix is an independent,
nonmathematical contribution to the field of linguistics.


Chapter 2 

Structure of Phonological Knowledge



The goal of this chapter is to elucidate the structure of phonological
knowledge, from the perspectives of computer science and information theory.

Linguistic sounds contain predictable information. For example, English
speakers invariably aspirate a voiceless stop in the onset of a syllable (as
illustrated by minimal pairs such as [kʰab]~[gab] and [pʰat]~[bat]) but only
when the onset is nonbranching (compare [pʰit]~[spit] and [kʰit]~[skit]).
The raised 'h' expresses aspiration, that is, that the segment is pronounced
with a slight puff of air. Moreover, vowels are lengthened before voiced
consonants (contrast the articulations of cab~cap, bag~back, and so forth).

Because these and other phonological dependencies are a part of the
language user's unconscious knowledge, they must be represented in an
adequate linguistic theory. In generative phonology, dependencies in the surface
forms are encoded by a grammar of rewriting rules, using a dictionary of
underlying forms. An underlying form represents the true, unpredictable
information content of a given surface form. It is constructed by combining
(typically, with concatenation or substitution) segmental sequences stored
in the dictionary of morphemes. The grammar G derives a surface form [s]
from its corresponding underlying form /u/ in the dictionary D by repeated
application of rules to the intermediate forms /i_1/, /i_2/, ... of the derivation, as
shown in 2.1:

\[ \forall\, /u/ \in D \qquad G : /u/ \rightarrow /i_1/ \rightarrow /i_2/ \rightarrow \cdots \rightarrow [s] \tag{2.1} \]


Each rule represents a natural class of predictable phonological
information. Continuing with our example, the generative grammar of an English
speaker must contain rules for aspiration and lengthening. The aspiration
rule rewrites the underlying form /kab/ as the intermediate form /kʰab/,
adding the entirely predictable aspiration information to the voiceless stop k.
Next, the lengthening rule applies, rewriting the intermediate form /kʰab/
as the surface form [kʰa:b].
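The two-rule derivation just described can be traced with a small sketch. It is a deliberately crude approximation: segments are plain strings, the inventories are a hypothetical toy subset, aspiration is checked only for a word-initial single-consonant onset, and `aspirate`, `lengthen`, and `derive` are names invented for this illustration.

```python
VOICELESS_STOPS = {"p", "t", "k"}
VOICED_CONSONANTS = {"b", "d", "g"}
VOWELS = {"a", "i"}

def aspirate(segments):
    """Aspirate a voiceless stop that is the sole onset consonant of the word."""
    out = list(segments)
    if len(out) >= 2 and out[0] in VOICELESS_STOPS and out[1] in VOWELS:
        out[0] += "ʰ"
    return out

def lengthen(segments):
    """Lengthen a vowel that immediately precedes a voiced consonant."""
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] in VOWELS and out[i + 1] in VOICED_CONSONANTS:
            out[i] += ":"
    return out

def derive(underlying, rules=(aspirate, lengthen)):
    """Apply the ordered rules once each, recording every form of the derivation."""
    forms = [list(underlying)]
    for rule in rules:
        forms.append(rule(forms[-1]))
    return forms

for form in derive(["k", "a", "b"]):
    print("".join(form))
```

Running the same rules on `list("skit")` leaves the form unchanged, because the onset branches and the final consonant is voiceless.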


According to generative phonology, the logical problem of language
comprehension consists of finding a structural description (that is, an underlying
form and a derivation chain) for a given surface form. In effect,
comprehension is reduced to the problem of searching for the underlying form that
generates a given surface form. When the surface form does not
transparently identify its corresponding underlying form, when the space of possible
underlying forms is large, or when the grammar is complex, then the logical
problem of language comprehension can quickly become very difficult.


The chapter is organized as follows. The next section introduces the
segmental model of phonology in some detail, discusses its computational
complexity, and proves that even restricted segmental models are extremely
powerful (undecidable). Subsequently, we consider various proposed and plausible
restrictions on the model, and conclude that plausibly restricted segmental
models will be in NP. Section 2.2 introduces the modern autosegmental
(nonlinear) model and discusses its computational complexity. We prove
that the natural problem of constructing an autosegmental representation
of an underspecified surface form is NP-hard.

The central contributions of this chapter are: (i) to analyze the
computational complexity of generative phonological theory, as it has developed
over the past thirty years, including segmental and autosegmental models;
(ii) to suggest a range of restrictions on the segmental model that reduce
the complexity of the corresponding language model from undecidable to
inside NP; (iii) to resolve some apparent mysteries regarding the SPE
evaluation metric and the notion of a linguistically significant generalization;
and (iv) to unify the description of suprasegmental processes as establishing
the segmental domain within which one 'head' segment in the domain is
phonetically distinguished from the nonhead segments in its domain.


2.1 Segmental phonology


We have seen that phonological knowledge includes a grammar of rewriting
rules, to derive surface forms from underlying forms. The scientific questions
that arise are: what is the class of permissible rewriting rules, and how are
they applied in the derivation?

To summarize, the rewriting rules of the segmental phonology are
unrestricted and can manipulate morpho-syntactic constituent structure. Rules
are ordered into a block and interact with each other. In a derivation the
block is repeatedly applied, from the innermost constituent out. Let us
examine this system in more detail.1

Phonological features are abstract as compared with phonetic
representations, although both are given in terms of phonetic features. The set of
features includes phonological features, diacritics, and the distinguished
feature segment that marks boundaries. Diacritic features are associated
with lexical items as a whole; they control the application of rules. An
example diacritic is ablaut, a feature that marks those stems that must
undergo a change in vowel quality, such as tense-conditioned ablaut in the
English sing, sung alternation. As noted in SPE, "technically
speaking, the number of diacritic features should be at least as large as the number
of rules in the phonology. Hence, unless there is a bound on the length of a
phonology, the set [of features] should be unlimited."2 (fn.1, p.390) Features
may be specified + or - or by an integral value 1, 2, ..., N where N is the
maximal degree of differentiation permitted for any linguistic feature. The
value of N varies from language to language, because languages admit
different degrees of differentiation in such features as vowel height, stress, and
tone. A set of feature specifications is called a unit or sometimes a segment.
A string of units is called a matrix or a segmental string.
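One way to picture these definitions is to represent a unit as a mapping from feature names to specifications and a matrix as a list of units. The sketch below makes that assumption; the feature names, the bound `N = 3`, and the helpers `is_valid_unit` and `matches` are all hypothetical and chosen only for illustration.

```python
N = 3  # hypothetical maximal degree of differentiation for this toy language

def is_valid_unit(unit, n=N):
    """Every specification must be '+', '-', or an integer value in 1..n."""
    return all(
        value in ("+", "-") or (type(value) is int and 1 <= value <= n)
        for value in unit.values()
    )

def matches(unit, description):
    """True if the unit carries every specification in the partial description."""
    return all(unit.get(feature) == value for feature, value in description.items())

# Boundaries are units too, distinguished by the feature 'segment'.
boundary = {"segment": "-"}
k = {"segment": "+", "syllabic": "-", "voice": "-"}
a = {"segment": "+", "syllabic": "+", "voice": "+", "stress": 1}
b = {"segment": "+", "syllabic": "-", "voice": "+"}

matrix = [boundary, k, a, b, boundary]  # a matrix is a string of units
voiced_consonants = [i for i, unit in enumerate(matrix)
                     if matches(unit, {"syllabic": "-", "voice": "+"})]
print(all(is_valid_unit(u) for u in matrix), voiced_consonants)
```

A partial description such as `{"syllabic": "-", "voice": "+"}` picks out a class of units, which is how rule contexts can refer to many segments at once.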

Suprasegmental relations are relations among segments, rather than
properties of individual segments. For example, a syllable is a hierarchical relation
between a sequence of segments (the nucleus of the syllable) and the less
sonorous segments that immediately precede and follow it (the onset and
coda, respectively).

1 The discussion in this chapter is based primarily on the theory presented by Noam
Chomsky and Morris Halle (1968) in The Sound Pattern of English (SPE). This
monumental work defined the field of generative phonology, by formalizing central ideas in the field,
including the notions of dependency, process, and linguistically-significant generalization.

2 Chomsky and Halle continue, "There is no point of principle involved here, and to
simplify exposition slightly we shall assume the set to be limited by an a priori condition.
A similar comment applies to [the set of specifications]."

Rewriting rules formalize the notion of a phonological process, that is, a
representation for the dependencies implicit in the surface matrices of a
given language. An elementary rule is of the form ZXAYW → ZXBYW
where A and B may be ∅ or any unit, A ≠ B; X and Y may be matrices
(strings of units), and Z and W may be thought of as brackets labeled with
syntactic categories such as 'S' or 'N' and so forth.
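A minimal sketch of an elementary rule, under simplifying assumptions, may help fix the notation. Units are single characters, the labeled brackets Z and W are omitted, the empty matrix is written as an empty tuple, and the class name `ElementaryRule` with its fields is hypothetical. Contexts are judged on the input matrix, so all matching sites are rewritten at once.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ElementaryRule:
    left: tuple    # X: units required immediately to the left
    focus: tuple   # A: () for the empty matrix, or a one-unit tuple
    change: tuple  # B: () for the empty matrix, or a one-unit tuple
    right: tuple   # Y: units required immediately to the right

    def apply(self, matrix):
        """Rewrite every matching site at once, judging contexts on the input."""
        matrix = tuple(matrix)
        n, k = len(matrix), len(self.focus)
        out, i = [], 0
        while i <= n:
            lo = i - len(self.left)
            hit = (lo >= 0
                   and matrix[lo:i] == self.left
                   and matrix[i:i + k] == self.focus
                   and matrix[i + k:i + k + len(self.right)] == self.right)
            if hit:
                out.extend(self.change)
                i += k
            if not hit or k == 0:
                # Copy the current unit unchanged (also after an insertion site).
                if i < n:
                    out.append(matrix[i])
                i += 1
        return tuple(out)

voicing = ElementaryRule(left=("d",), focus=("s",), change=("z",), right=())
deletion = ElementaryRule(left=("b",), focus=("a",), change=(), right=())
insertion = ElementaryRule(left=("a",), focus=(), change=("x",), right=("b",))
print(voicing.apply("ds"), deletion.apply("bab"), insertion.apply("ab"))
```

Because both A and B may be empty, the same rule format covers substitution, deletion, and insertion, which is the property the undecidability argument later relies on.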

Some phonological processes, such as the assimilation of voicing across
morpheme boundaries, are very common across the world's languages. Other
processes, such as the arbitrary insertion of consonants or the substitution
of one unit for another entirely distinct unit, are extremely rare or entirely
unattested. For this reason, all adequate phonological theories must
include an explicit measure of the naturalness of a phonological process. A
phonological theory must also define a criterion to decide what constitutes
two independent phonological processes and what constitutes a legitimate
phonological generalization.

Two central hypotheses of segmental phonology are (i) that the most
natural grammars contain the fewest symbols and (ii) a set of rules represents
independent phonological processes when they cannot be more compactly
combined into a single complex rule (Halle 1961, 1962).

A complex rule, then, is a finite schema for generating a (potentially infinite)
regular set of elementary rules. To a first approximation, the complex rules
of SPE are formed by combining units with the operations of union,
concatenation, Kleene star, and exponentiation, using variables whose values
range over specifications, units, and syntactic categories.3 Johnson (1972)
considers a more powerful class of complex rules (and thereby, a different
evaluation metric), as well as alternative modes of rule application.4

3 Segmental models belong to the class of nonlocal rewriting systems analyzed by Post,
rather than to the local systems due to Thue. This is because complex rules can encode
a finite number of nonlocal dependencies, and hence the rewriting activity specified by a
complex rule can affect parts of the current derivation string separated by an arbitrary
distance.

4 In Johnson's proposal, the empty string and each unit are schemata; schemata may be
combined by the operations of union, intersection, negation, Kleene star, and
exponentiation over the set of units. Johnson also uses variables and Boolean conditions in his
schemata. This "schema language" is an extremely powerful characterization of the class
of regular languages over the alphabet of units; it is not used by practicing phonologists.
Because a given complex rule can represent an infinite set of elementary rules, Johnson
shows how the iterated, exhaustive application of one complex rule to a given segmental
string can "effect virtually any computable mapping," (p.10) that is, can simulate
any TM computation in only one step of the derivation. Next, he proposes a more restricted
"simultaneous" mode of application for complex rules, which is capable of performing at
most a finite-state mapping in any single application. The "disjunctive ordering" mode of
rule application proposed in SPE is only capable of performing a strictly finite mapping
in any single rule application. This mode of application, which is vastly more constrained
than either of Johnson's proposals, is also the one used by practicing phonologists. In this
chapter we consider the question of what computations can be performed by a finite set
of elementary rules, and hence provide very loose lower bounds for Johnson's excessively
powerful model. We note in passing, however, that the problem of simply determining
whether a given rule is subsumed by one of Johnson's schemata is itself wildly intractable,
requiring at least exponential space. (The idea of the proof is to construct two complex
rules: one generates all possible strings, and the other describes the valid computations of
a $2^n$ space-bounded DTM. The latter rule may be constructed by negating the construction
due to Hopcroft and Ullman 1979:350ff.)

The complex rules are organized into a linear sequence $R_1, R_2, \ldots, R_n$; they
are applied in order to an underlying matrix to obtain a surface matrix.
Ignoring a great many issues that are important for linguistic reasons but
irrelevant for our purposes, we may think of the derivational process as
follows. The input to the derivation, or "underlying form," is a bracketed
string of morphemes, the output of the syntax. The output of the derivation
is the "surface form," a string of phonetic units. The derivation consists of
a series of cycles. On each cycle, the ordered sequence of rules is applied to
every maximal string of units containing no internal brackets, where each
$R_{i+1}$ applies (or doesn't apply) to the result of applying the immediately
preceding rule $R_i$, and so forth. Each complex rule $R_i$ itself generates a
disjunctive sequence of elementary rules $R_{i,1}, R_{i,2}, \ldots$ in order of increasing
generality.5 That is, $R_i$ generates a sequence of elementary rules where $R_{i,j}$
precedes $R_{i,k}$ in the sequence iff the preconditions of $R_{i,j}$ subsume the
preconditions of $R_{i,k}$; the earliest $R_{i,j}$ that can apply to the current derivation
matrix is applied to it, to the exclusion of all other $R_{i,k}$. Each elementary
rule applies maximally to the current derivation string, that is,
simultaneously to all units in the string. For example, if we apply the rule A → B to
the string AA, the result is the string BB. At the end of the cycle, the last
rule $R_n$ erases the innermost brackets, and then the next cycle begins with
the rule $R_1$. The derivation terminates when all brackets are erased.

5 The interpretation advocated in SPE (p.390ff) is that the true grammar g is the set
of elementary rules; the value of this grammar, given by the evaluation metric, is the
smallest number of phonological features of any set g' of complex rules that generates the
set g. Chapter 9 of Kenstowicz and Kisseberth (1979) contains a less technical summary
of the SPE system and a discussion of subsequent modifications and emendations to it.
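The cyclic mode of application described above can be sketched as follows. The sketch assumes a bracketed form is a nested Python list, a rule is any function from a list of units to a list of units, and bracket erasure is done by the driver rather than by a final rule $R_n$; the names `cycle`, `derive`, `a_to_b`, and `b_to_c` are hypothetical.

```python
def is_innermost(node):
    """A constituent is innermost when it contains no internal brackets."""
    return all(not isinstance(child, list) for child in node)

def cycle(node, rules):
    """One cycle: rewrite every innermost constituent, then erase its brackets."""
    if is_innermost(node):
        units = list(node)
        for rule in rules:            # the ordered rule sequence
            units = rule(units)
        return units, True            # True: this constituent's brackets go away
    out = []
    for child in node:
        if isinstance(child, list):
            rewritten, erased = cycle(child, rules)
            if erased:
                out.extend(rewritten)  # splice the units into the parent
            else:
                out.append(rewritten)
        else:
            out.append(child)
    return out, False

def derive(form, rules):
    """Repeat cycles until the outermost brackets have been erased."""
    cycles = 0
    while True:
        form, done = cycle(form, rules)
        cycles += 1
        if done:
            return form, cycles

def a_to_b(units):
    return ["B" if u == "A" else u for u in units]   # applies to all units at once

def b_to_c(units):
    return ["C" if u == "B" else u for u in units]

print(derive([["A"], "A"], [b_to_c, a_to_b]))
```

In this example the inner unit passes through two cycles and ends as C, while the outer unit sees only the last cycle and ends as B, which is how depth of embedding can leave a trace in the surface form.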

2.1.1 Complexity of segmental recognition and generation

Let us say a dictionary D is a finite set of the underlying phonological
forms (that is, bracketed matrices) of morphemes. These morphemes may
be combined by concatenation and simple substitution (a syntactic category
is replaced by a morpheme of that category) to form a possibly infinite set of
underlying forms. Then we may characterize the two central computations
of phonology as follows.

The phonological generation problem (PGP) is: Given a completely specified
phonological matrix x and a segmental grammar g, compute the surface form
y = g(x) of x.

The phonological recognition problem (PRP) is: Given a (partially
specified) surface form y, a dictionary D of underlying forms, and a segmental
grammar g, decide if the surface form y = g(x) can be derived from some
underlying form x according to the grammar g, where x is constructed from
the forms in D.
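The contrast between the two problems can be sketched with a toy grammar. Everything here is an illustrative assumption: forms are plain strings, the grammar is a list of string-rewriting functions, the surface form is fully specified, and the search in `recognize` is cut off at `max_morphemes` concatenated dictionary entries. That bound is what makes the sketch terminate; the unrestricted problem has no such bound.

```python
from itertools import product

def generate(x, grammar):
    """PGP: run every rule of the grammar, in order, on the underlying form x."""
    for rule in grammar:
        x = rule(x)
    return x

def recognize(y, dictionary, grammar, max_morphemes):
    """PRP, restricted to underlying forms built from at most max_morphemes entries."""
    for n in range(1, max_morphemes + 1):
        for morphemes in product(dictionary, repeat=n):
            x = "".join(morphemes)
            if generate(x, grammar) == y:
                return morphemes       # an underlying form that derives y
    return None

def devoice(x):
    """Toy rule: /z/ becomes [s] immediately after a voiceless stop."""
    for stop in "ptk":
        x = x.replace(stop + "z", stop + "s")
    return x

dictionary = ["kat", "dog", "z"]
grammar = [devoice]
print(generate("katz", grammar))
print(recognize("kats", dictionary, grammar, 2))
print(recognize("dogs", dictionary, grammar, 2))
```

Generation is a single forward pass, while recognition searches a space of candidate underlying forms that grows exponentially with the number of morphemes allowed.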

Lemma 2.1.1 The segmental model can simulate the computation of any
DTM M on any input w, using only elementary rules.

Proof. We sketch the simulation. The underlying form x represents the
TM input w, while the surface form y represents the halted state of M
on w. The instantaneous description of the machine (tape contents, head
position, state symbol) is represented in the string of units. Each unit
represents the contents of a tape square. The unit representing the currently
scanned tape square will also be specified for two additional features, to
represent the state symbol of the machine and the direction in which the
head will move. Therefore, three features are needed, with a number of
specifications determined by the finite control of the simulated machine M.
Each transition of M is simulated by a phonological rule. A few rules are also
needed to move the head position around, and to erase the entire derivation
string when the simulated machine halts.

There are only two key observations, which do not appear to have been
noticed before. The first is that contrary to common misstatement in the
linguistics literature, phonological rules are not technically context-sensitive.
Rather, they are unrestricted rewriting rules because they can perform
deletions as well as insertions. This is essential for the reduction, because it
allows the derivation string to become arbitrarily long. The second
observation is that segmental rules can freely manipulate (insert and delete)
boundary symbols, and thus it is possible to prolong the derivation
indefinitely: we need only employ a rule $R_{n-1}$ at the end of the cycle that adds
an extra boundary symbol to each end of the derivation string, unless the
simulated machine has halted. The remaining details are straightforward,
and are therefore omitted. □

The immediate consequences are:

Theorem 1 PGP is undecidable.

Proof. By reduction to the undecidable problem $w \in L(M)$? of deciding
whether a given TM M accepts an input w. The input to the generation
problem consists of an underlying form x that represents w and a segmental
grammar g that simulates the computations of M according to lemma 2.1.1.
The output is a surface form y = g(x) that represents the halted
configuration of the TM, with all but the accepting unit erased. □


Theorem 2 PRP is undecidable.

Proof. By reduction to the undecidable problem $L(M) = \emptyset$? of deciding
whether a given TM M accepts any inputs. The input to the recognition
problem consists of a surface form y that represents the halted accepting
state of the TM, a trivial dictionary capable of generating $\Sigma^*$, and a
segmental grammar g that simulates the computations of the TM according to
lemma 2.1.1. The output is an underlying form x that represents the input
that M accepts. The only trick is to construct a (trivial) dictionary capable
of generating all possible underlying forms $\Sigma^*$. □
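The idea behind the simulation in lemma 2.1.1 can be sketched in a few lines. This is not the construction of the proof: here a unit is a pair (tape symbol, state or None), the direction of movement is read from the transition table rather than stored as a feature, cycles and boundary symbols are not modeled, and the step budget `max_steps` exists only so that the sketch always returns. The names `step`, `derive`, and the toy machine are hypothetical.

```python
BLANK = "_"

def step(units, delta):
    """Apply the single rewriting rule whose precondition matches the scanned unit."""
    for i, (symbol, state) in enumerate(units):
        if state is None:
            continue                          # not the scanned square
        if (state, symbol) not in delta:
            return None                       # no rule applies: the machine has halted
        new_state, write, move = delta[(state, symbol)]
        units = list(units)
        units[i] = (write, None)
        j = i + (1 if move == "R" else -1)
        if j == len(units):
            units.append((BLANK, None))       # the derivation string grows on demand
        if j < 0:
            units.insert(0, (BLANK, None))
            j = 0
        units[j] = (units[j][0], new_state)
        return units
    return None

def derive(tape, delta, start, max_steps=1000):
    """Rewrite the string of units until no rule applies."""
    units = [(s, None) for s in tape] or [(BLANK, None)]
    units[0] = (units[0][0], start)
    for _ in range(max_steps):
        nxt = step(units, delta)
        if nxt is None:
            return units
        units = nxt
    raise RuntimeError("step budget exhausted")

# Toy machine: scan right over 1s, append one more 1, then halt in state 'h'.
delta = {("q", "1"): ("q", "1", "R"), ("q", BLANK): ("h", "1", "R")}
surface = derive("11", delta, "q")
print("".join(symbol for symbol, _ in surface))
```

Without the step budget, deciding what `derive` returns for an arbitrary table `delta` is the halting problem, which is the source of the two undecidability results.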

Let us now turn to consider the range of plausible formal restrictions on the
segmental model.

2.1.2 Restricting the segmental model 

We consider ways to bound the length of derivations, limit the number of
features, and constrain the form of phonological rewriting rules, as well as
their interactions.


The first restriction is to eliminate complex rules. In particular, let us limit
complex rules to union and concatenation. This restriction is plausible (for
the purposes of this chapter) because complex rules are used to model
nonlocal phonological dependencies, and these dependencies are now modeled
by the autosegmental model, which we examine in section 2.2.


Bounding the derivation length

The next restriction is to prevent phonological rules from inserting
boundaries. In the SPE formalism, all rules can manipulate boundaries, which are
simply those units specified [-segment]. However, in the grammars actually
postulated by phonologists, only the readjustment rules manipulate
boundaries. So let us formally prevent phonological rules from ever inserting or
deleting a boundary. Now rules that manipulate boundaries are properly
included in the class of readjustment rules.6

Boundaries must be manipulated for two reasons. The first is to reduce the
number of cycles in a given derivation, by deleting boundaries and
flattening syntactic structure, for example to prevent the phonology from assigning
too many degrees of stress to a highly-embedded structure. The second is to
rearrange the boundaries given by the syntax when the intonational
phrasing of an utterance does not correspond to its syntactic phrasing (so-called
"bracketing paradoxes"). In this case, boundaries are merely moved around,
while preserving the total number of boundaries in the string. The only way
to accomplish this kind of bracket readjustment in the segmental model is
with rules that delete brackets and rules that insert brackets. Therefore, if
we wish to exclude rules that insert boundaries, we must provide an
alternate mechanism for boundary readjustment. For the sake of argument,
and because it is not too hard to construct such a boundary readjustment
mechanism, let us henceforth adopt this restriction. Now how powerful is
the segmental model?

Although the generation problem is certainly decidable now, the recognition
problem remains undecidable, because the dictionary and syntax are both
potentially infinite sources of boundaries: the underlying form x needed to
generate any given surface form according to the grammar g could be
arbitrarily long and contain an arbitrary number of boundaries. Therefore,
the complexity of the recognition problem is unaffected by the proposed
restriction on boundary readjustments. The obvious restriction then is to
additionally limit the depth of embeddings by some fixed constant.
(Chomsky and Halle flirt with this restriction for the linguistic reasons mentioned
above, but view it as a performance limitation, and hence choose not to
adopt it in their theory of linguistic competence.)

6 Not all readjustment rules manipulate boundaries. In general, readjustment rules
map the surface forms given by the syntax into the underlying forms of the phonology.
For example, they are used to map abstract morphemes, such as inflection or agreement,
into phonological matrices, and to modify syntactic categories, as when Fifth Avenue is
mapped from a noun in the syntax to a noun phrase in the phonology, in order that it be
assigned the correct final-stress.

Lemma 2.1.2 Each derivational cycle can directly simulate any polynomial
time alternating Turing machine (ATM) computation using only elementary
rules.

Proof. By reduction from a polynomial-depth ATM computation. The input
to the reduction is an ATM M with input w. The output is a segmental
grammar g and underlying form x s.t. the surface form y = g(x) represents
a halted accepting computation iff M accepts w in polynomial time.
The major change from lemma 2.1.1 is to encode the entire instantaneous
description of the ATM state (that is, tape contents, machine state, head
position) in the features of a single unit. To do this requires a polynomial
number of features, one for each possible tape square, plus one feature for
the machine state and another for the head position. Now each derivation
string represents a level of the ATM computation tree. The transitions of the
ATM computation are encoded in a block B as follows. An AND-transition
is simulated by a triple of rules, one to insert a copy of the current state,
and two to implement the two transitions. An OR-transition is simulated
by a pair of disjunctively-ordered rules, one for each of the possible
successor states. The complete rule sequence consists of a polynomial number
of copies of the block B. The last rules in the cycle delete halting states,
so that the surface form is the empty string (or reasonably-sized string of
'accepting' units) when the ATM computation halts and accepts. If, on the
other hand, the surface form contains any nonhalting or nonaccepting units,
then the ATM does not accept its input w in polynomial time. The reduction
may clearly be performed in time polynomial in the size of the ATM
and its input. □
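The encoding used in this proof can be pictured with a small sketch. It is
an illustration only, under assumed names (encode_id, and_step, and
delete_halted are hypothetical and do not appear in the report): one unit
carries a whole instantaneous description as a bundle of features, an
AND-transition copies the unit and rewrites both copies, and the last rules
of the cycle delete units in a halted accepting state.

```python
# Illustrative sketch only: every name below is hypothetical.

def encode_id(tape, state, head):
    """Pack one ATM instantaneous description into the features of one unit.

    One feature per tape square, plus one for the machine state and one for
    the head position, so a tape of n squares needs n + 2 features.
    """
    unit = {f"sq{i}": symbol for i, symbol in enumerate(tape)}
    unit["state"] = state
    unit["head"] = head
    return unit


def and_step(level, index, succ_a, succ_b):
    """AND-transition: copy the unit at `index`, then rewrite the two copies."""
    unit = level[index]
    return level[:index] + [succ_a(unit), succ_b(unit)] + level[index + 1:]


def delete_halted(level, accepting_state="accept"):
    """Final rules of the cycle: delete units in the halted accepting state."""
    return [unit for unit in level if unit["state"] != accepting_state]


if __name__ == "__main__":
    start = encode_id("ab", "q0", 0)
    level = and_step(
        [start], 0,
        lambda u: {**u, "state": "accept"},
        lambda u: {**u, "state": "q1"},
    )
    print(len(start), [u["state"] for u in delete_halted(level)])
```

An empty list after delete_halted corresponds to the empty surface form of
an accepting computation; any unit left over corresponds to a nonhalting or
nonaccepting branch.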

Because we have restricted the number of embeddings in an underlying form
to be no more than a fixed language-universal constant, no derivation can
consist of more than a constant number of cycles. Therefore, lemma 2.1.2
establishes the following theorems:

Theorem 3 PGP with bounded embeddings and elementary rules is
PSPACE-hard.


Proof. The proof is an immediate consequence of lemma 2.1.2 and a corollary
to the Chandra-Kozen-Stockmeyer theorem (1981) that equates polynomial
time ATM computations and PSPACE DTM computations. □

Theorem 4 PRP with bounded embeddings and elementary rules is
PSPACE-hard.

Proof. The proof follows from lemma 2.1.2 and the Chandra-Kozen-Stockmeyer
result. The dictionary consists of the lone unit that encodes the ATM
starting configuration (that is, input w, start state, head on leftmost square).
The surface string is either the empty string or a unit that represents the
halted accepting ATM configuration. □

There is some evidence that the PGP with bounded embeddings and elementary
rules is also inside PSPACE. The requirement that the reduction
be polynomial time limits us to specifying a polynomial number of features
and a polynomial number of rules. Since each feature corresponds to an
ATM tape square and each segment corresponds to an instantaneous
description, this kind of simulation seems limited to PSPACE ATM
computations. Since each phonological rule corresponds to a next-move relation,
that is, one time step of the ATM, the simulation appears further limited to
specifying PTIME ATM computations, which correspond to PSPACE DTM
computations.7

For the PRP, the dictionary (or syntax-interface) provides the additional
ability to nondeterministically guess an arbitrarily long, boundary-free
underlying form x with which to generate a given surface form g(x). This
capacity remains unused in the preceding proof, and it is not too hard to
see how it might lead to undecidability.8

7 We must be careful, however, to make our assumptions about the [illegible]
model perfectly explicit. If optional rules are entirely unrestricted, then
they can [illegible] the bounded nondeterminism and the PGP could be as complex
as the PRP, which has access to the unbounded nondeterminism of the
dictionary. See footnote 6.

8 As we saw above, the morpheme-dictionary/syntax-interface provides the ability to





Limiting the number of features

Another computational restriction is to limit the number of phonological
features.

The number of phonological features has a significant effect on the
computational complexity of phonological processes, because each binary feature
provides the derivation with a bit of computational space. (This is true even
though only a small, fixed number of features were needed to prove
undecidability of PRP and PGP; in those reductions, each segment simulated
a tape square.) As Chomsky and Halle noted, the SPE formal system is
most naturally seen as having a variable (unbounded) set of features and
specifications. This is because languages differ in the diacritics they employ,
as well as differing in the degrees of vowel height, tone, and stress they
allow. Therefore, the set of features must be allowed to vary from language
to language, and in principle is limited only by the number of rules in the
phonology; the set of specifications must likewise be allowed to vary from
language to language.

Yet there is an important distinction to be made between the diacritic
features and the phonological features. Diacritics are properties of matrices,
while phonological features are properties of units. Diacritics are used only
to control the derivation; they are never affected by it. No rule ever rewrites
the value of a diacritic. So even though the phonology must have access to
an unbounded number of diacritics, it cannot perform unbounded
computations on them.

That leaves us with the phonological features, which we can further separate
into two classes: articulatory and suprasegmental. The suprasegmental
features, such as degree of stress, are the topic of the next section. So let us
examine the articulatory features here.

nondeterministically guess an arbitrarily long underlying form x with which to generate
a given surface form g(x) (in the context of the PRP only). We can harness this power
if we can encode the entire TM computation in a single segmental string. Complex rules
will ensure that a given underlying form describes a legal computation string. Let units
encode tape squares, as in the proof of lemma 2.1.1, requiring a fixed number of features.
(If we let each unit encode a complete instantaneous description, as in the proof
of lemma 2.1.2, then we will have proved that the PRP is EXPPOLY time hard, using an
unbounded number of features.) As before, the dictionary generates all strings of units,
corresponding to all possible computations. The segmental grammar consists of three
stages. In the first stage, optional rules nondeterministically specify the units; in the
second stage, adjacent units are checked to ensure that they obey the next-move relation
of the TM, and if they don't they are marked as illegal; in the third stage, the computation
string is reduced to a single unit, which is either marked as illegal or as representing a
halted accepting state of the machine. The reduction is considerably simpler using the
nonlocal Post rewriting specified by complex rules.

There are a fixed number of articulatory features, on the order of 10 to 15,
determined by the musculature of the vocal apparatus. It is also true that
there is an upper limit to the number N of perceivable distinctions that any
one phonetic feature is capable of supporting. Therefore, it is an empirical
fact that the number of possible phonetic segments is fixed in advance for all
human languages. And from a computer science perspective, this number is
small, between a thousand and a million.
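One way to see where an estimate of this size could come from is to assume
that a segment is fully determined by its feature values, so that k features
with N values each allow at most N^k segments. The report does not spell out
this calculation; the function name and the sample values below are
assumptions chosen for illustration.

```python
# Illustrative arithmetic; the bound N**k is an assumed reading of the estimate.

def max_segments(num_features, distinctions_per_feature):
    """Upper bound on distinct segments when each feature takes one of N values."""
    return distinctions_per_feature ** num_features


if __name__ == "__main__":
    print(max_segments(10, 2))  # 10 binary features: 1024
    print(max_segments(15, 2))  # 15 binary features: 32768
    print(max_segments(10, 4))  # 10 four-valued features: 1048576
```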

If we take these empirical upper bounds seriously, then the PSPACE
simulation described in lemma 2.1.2 would no longer be allowed, because it
requires an unbounded number of features. Although the undecidable TM
simulation in lemma 2.1.1 would not be affected by this constraint, because
it only requires a (very small) fixed number of features, this simulation is
independently excluded by the proposed limit on the number of embeddings.
However, using the ideas in footnote 8, we would still be able to prove that
the PRP is undecidable.

On the one hand, this fact provides a convenient constraint on the
complexity of the PGP, that can be defended solely on empirical grounds. On the
other hand, this fact does not lead to any insight or understanding, and it
is difficult to defend such a trivial constraint on scientific or mathematical
grounds.

To fix the number of segments complicates the linguistic theory needlessly,
requiring an extra statement whose only consequence is to block a certain
class of complexity proofs, and inhibit the search for more significant
constraints. It has no other consequences, and if a new feature was discovered,
the bound would only be revised upwards. Therefore, the proper scientific
idealization is that the phonology has an arbitrary articulatory base (as seen
in sign languages).

The purpose of this report is to illuminate the computational properties of
human language, as a function of the natural parameters of variation, using
the tools of computer science. We know that the number of phonological
distinctions (that is, features and specifications) is a natural parameter of
the phonology, because it varies from language to language. We also know
that this parameter affects both the computational and informational
complexity of the phonology, and for that reason it must be included in an
honest mathematical analysis. To exclude it from consideration is only to
blind ourselves to the computational structure of the system. Therefore,
the idealization to an unbounded number of features is necessary on purely
mathematical grounds.

Such a fixed bound is also at odds with a fundamental insight of the SPE
system. A central component of SPE is the complex rule formalism, which
characterizes the class of linguistically significant generalizations. The
central difference between complex and elementary rules is that complex rules
naturally describe nonlocal dependencies, whereas elementary rules are
limited to describing local dependencies. The central prediction of an evaluation
metric defined on the complex rules, then, is that nonlocal phonological
dependencies are as natural as local dependencies. In other words, the class of
phonological dependencies cannot naturally be encoded with a finite-state
automaton. Yet, when we fix the length of derivations and maximal number
of features, we limit the phonology to only describing finite-state
dependencies.

I return to consider the nature of unbounded idealizations in a more general
setting in appendix A.2.

To my mind, the most telling reason not to fix the number of articulatory
features is that it is a jejune constraint that does not itself lead to any
understanding. Even worse, it distracts us from the truly interesting research
questions, such as whether features do in fact correspond to reusable
computational space in the phonology, as we have used them in the reductions.
When we block the reductions by fixing the number of features, we do not
answer this question. All we do is make the question irrelevant, because to
fix the features is simply to ignore them. So, for the purpose of increasing
our understanding of human language, let us keep our idealization to an
unbounded number of features, and proceed with the investigation.


Restricting the rewriting rules 

In order to bound the time resources available to the phonology, we have
considered limiting the number of derivational cycles directly, and indirectly,
by restricting the class of readjustment rules. We have also examined ways to
bound the space resources available to the phonology, by limiting the number
of phonological features. Although these purely formal restrictions block
a class of reductions, and constrain the class of phonological computations
(and thereby the class of characterizable phonological dependencies), neither
suffices to eliminate the intractability that is inherent in an unrestricted
rewriting system. This raises the question, how might we restrict the rules
themselves?

Elementary rules are used in at least six ways: (i) to convert binary
phonological features to n-ary phonetic features, for example, the nasal feature in
SPE; (ii) to make a unit agree or disagree with the features of an adjacent
unit, that is, to represent assimilation and dissimilation processes; (iii) to
insert units that are entirely predictable, as in English epenthetic vowels;
(iv) to delete units that violate well-formedness conditions on
representations; (v) to swap two adjacent units, which is called metathesis; and (vi) to
derive irregular surface forms, as in the English /go/ + <past> → [went].
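Uses (ii) through (v) can be sketched as small local operations on a list of
segments. This is a toy illustration, not the SPE rule notation: segments
are plain characters, the rule names and the example words are hypothetical,
and feature matrices are ignored entirely.

```python
# Illustrative sketch; segments are plain characters and all names are hypothetical.

def assimilate_nasal(segs):
    """(ii) Agreement: 'n' becomes 'm' before a labial 'p' or 'b'."""
    out = list(segs)
    for i in range(len(out) - 1):
        if out[i] == "n" and out[i + 1] in "pb":
            out[i] = "m"
    return out


def epenthesize(segs, sibilants="sz", vowel="e"):
    """(iii) Insertion: put a predictable vowel between two adjacent sibilants."""
    out = []
    for i, seg in enumerate(segs):
        out.append(seg)
        if seg in sibilants and i + 1 < len(segs) and segs[i + 1] in sibilants:
            out.append(vowel)
    return out


def delete_final(segs, banned="b"):
    """(iv) Deletion: drop a word-final segment that is not allowed there."""
    return list(segs[:-1]) if segs and segs[-1] in banned else list(segs)


def metathesize(segs, first, second):
    """(v) Metathesis: swap the first adjacent pair `first second`."""
    out = list(segs)
    for i in range(len(out) - 1):
        if out[i] == first and out[i + 1] == second:
            out[i], out[i + 1] = out[i + 1], out[i]
            break
    return out


if __name__ == "__main__":
    print("".join(assimilate_nasal("inpat")))
    print("".join(epenthesize("busz")))
    print("".join(delete_final("lamb")))
    print("".join(metathesize("aks", "k", "s")))
```

Each function inspects only a unit and its immediate neighbor, which is the
sense in which such rules are local.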

Abstract words are rewritten as segmental strings in the interface between
syntax and phonology (see chapter 3). The derivations of irregular and
regular forms are identical from this perspective: both are simply the arbitrary
rewriting of abstract morphological constituents into segmental strings. The
first restriction on elementary rules, then, is to limit the class of arbitrary
rewritings to the interface between phonology and morphology, and so ban
the arbitrary rewriting of segmental strings from the phonology proper.

Rules that delete, change, exchange, or insert segments, as well as rules that
manipulate boundaries, are crucial to phonological theorizing, and therefore
cannot be crudely constrained.9 More subtle and indirect restrictions
are needed for these rules.10

9 One restriction proposed in the literature is McCarthy's (1981:405) "morpheme rule
constraint" (MRC), which requires all morphological rules to be of the form A → B/X
where A is a unit or ∅, and B and X are (possibly null) strings of units. (X is the
immediate context of A, to the right or left.) The MRC does not constrain the computational
complexity of segmental phonology because individual rules can still insert and delete
segments, and groups of rules can be coordinated to perform arbitrary rewriting.

10 That Chomsky and Halle were well aware of these problems is beyond doubt: "A
possible direction in which one might look for such an extension of the theory is suggested
by certain other facts that are not handled with complete adequacy in the present theory.
Consider first the manner in which the process of metathesis was treated in Chapter
Eight, Section 5. As will be recalled, we were forced there to take advantage of powerful
transformational machinery of the sort that is used in the syntax. This increase in the
power of the formal devices of phonology did not seem fully justified since it was made only
to handle a marginal type of phenomenon. An alternative way to achieve the same results
is to introduce a special device which would be interpreted by the conventions on rule
application as having the effect of permuting the sequential order of a pair of segments."





One indirect restriction is to limit the possible interactions among rules.
Because segmental grammars do not have a finite state control, all rule
interactions must arise via the derivation form (that is, the sequence of
segmental strings that is the computation string for the segmental derivation).
The computationally significant interactions are ones that use the derivation
form to store intermediate results of computations. The segmental model
allows one rule to make a change in the derivation form, and a subsequent
rule to make a change to this change, and so on. A segment that is
inserted can subsequently be deleted; a segment that is switched with another
segment can subsequently be switched with another segment, or deleted.

We have every reason to believe that such interactions are not natural. The
underlying form of a word must encode all the information needed to
pronounce that word, as well as recognize it. This information must be readily
accessible, in order to ease the task of speaking, as well as that of acquiring
the underlying forms of new words. The underlying form of a given word is
that representation that omits all the directly predictable information in the
surface form. The methodological directive "omit predictable information"
means that a feature or segment of a surface form must be omitted if it is
directly predictable from the properties of the phonology as a whole (such as
the structure of articulations or the segmental inventory), or from the
properties of that particular surface form, such as its morpheme class, adjacent
segments, or suprasegmental patterns. To a first approximation, "directly
predictable" means "computable by one rule with unbounded context and
no intermediate results."

In point of fact, insertions and deletions do not interact in the systems
proposed by phonologists. Units are inserted only when they appear in
the surface form, and are totally predictable. Such units are never deleted.
Because inserted units aren't deleted, and because an underlying form is
proportional to the size of its surface form, the derivation can only perform
a limited number of deletions, bounded by the size of the underlying form.
In general, deletions typically only occur at boundaries, in order to "fix-up"
the boundary between two morphemes. Because underlying forms cannot
consist solely of boundaries, we would expect the size of an underlying form
to be proportional to the size of its surface realization.

The immediate consequence of this "direct prediction" property of segmental
rules is that underlying forms cannot contain significantly more segments

(MiJ) ' 





or features than their corresponding surface forms. It is also true that the
derivation adds predictable information to the underlying form in a nearly
monotonic fashion. The next restriction is to severely limit rule interactions
in the segmental model: to exclude the storing of intermediate results in
derivation forms, to require all derivation paths to accept and to be nearly
monotonic. Deletion phenomena might be modeled using a diacritic that
blocks the insertion of the "deleted" segment. The details of such a (nearly or
strictly) monotonic model must of course be worked out. But it is promising,
and if plausible, as it seems to be, then the simulations in footnote 8 would
be excluded. This is one formal way to define the notion of "predictable
information," which, based as it is in the fundamental notion of
computationally accessible information, seems more coherent and fundamental than
the notion of a "linguistically-significant generalization," which has proven
elusive.
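Since the details of a monotonic model are left open here, the following is
only one possible reading, stated as an assumption: a derivation step counts
as monotonic if the earlier form can be recovered from the later one by
deleting segments, that is, if the step only adds material. The function
names are hypothetical, and features and diacritics are ignored.

```python
# Illustrative sketch of one assumed definition of a monotonic derivation.

def is_subsequence(short, long):
    """True if `short` can be obtained from `long` by deleting segments only."""
    remaining = iter(long)
    return all(seg in remaining for seg in short)


def nonmonotonic_steps(derivation):
    """Indices of steps that lose or change material instead of only adding it."""
    return [
        i
        for i in range(1, len(derivation))
        if not is_subsequence(derivation[i - 1], derivation[i])
    ]


if __name__ == "__main__":
    # Pure insertion: no violations.
    print(nonmonotonic_steps(["bus+z", "bus+ez"]))
    # Insert a segment, then delete it again: the second step is flagged.
    print(nonmonotonic_steps(["ab", "axb", "ab"]))
```

Under this reading, a derivation that stores an intermediate result and
later erases it shows up as at least one flagged step.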


2.1.3 The SPE evaluation metric

The SPE evaluation metric is a proposal to define the notion of a natural
rule and linguistically-significant generalization. At first glance, this
proposal seems vacuous. In order to minimize the number of symbols in the
grammar, observed surface forms should simply be stored in the dictionary
of underlying forms. Then the number of symbols in the grammar is zero,
and all the linguistically significant generalizations in the corpus have been
discovered, that is, none. Clearly, this is not what Chomsky and Halle
intended.

Perhaps the size of the dictionary must be included in the metric as well.
Now the most natural phonology is the smallest grammar-dictionary whose
output is consistent with the observed corpus. The solution to this problem
is also trivial: the optimal grammar-dictionary simply generates Σ*.

So the simplest coherent revision of the SPE metric states the most natural
phonological system is the smallest grammar-dictionary that generates
exactly the finite set of observed forms.11 Ignoring questions of feasibility (that

11 In this we adopt a standard assumption of the field, that the language acquisition
device does not have access to negative evidence. If negative evidence were allowed, then
the SPE metric would be revised to state that the most natural system is the smallest
grammar-dictionary consistent with the evidence, that is, accepts all positive examples
and rejects all negative examples. This approach is promising under the weak definition
of "negative evidence" as "absence of confirmation."





is, how to algorithmically find such a system), we run into serious empirical
problems because the observed corpus is always finite. The smallest
grammar will always take advantage of this finiteness, by discovering patterns
not yet falsified by the set of observed surface forms. The underlying forms
in such an optimal grammar-dictionary system will in fact look nothing like
the true underlying forms, that is, those postulated by phonologists on the
basis of scientific evidence that is not available to the language acquisition
device (LAD). And even if the set of underlying forms is fixed, the optimal
grammar in such a system will still not be natural, failing standard empirical
tests, such as those posed by loan words and language change.

This observation is confirmed by the complexity proofs. An important
corollary to lemma 2.1.1 is that segmental grammars form a universal basis for
computation. It is possible to simulate an arbitrary Post tag system using
a very simple set of phonological rules. Or we can simulate the four-symbol
seven-state "smallest universal Turing machine" of Minsky (1967) in the
segmental model; the resulting grammar contains no more than three
features, eight specifications, and 36 trivial rules. These segmental grammars of
universal computation contain significantly fewer symbols than a segmental
grammar for any natural language. And this is not even the best that can be
done. The smallest combined grammar-dictionary for the set of all observed
words will be even smaller, because it can take advantage of all computable
generalizations among the finite set of observed surface forms, not only the
linguistically significant ones. In fact, the attached dictionary would
represent the Kolmogorov complexity of the observed surface forms with respect
to the optimal segmental grammar, that is, the true information content of
the observed surface forms with respect to an arbitrarily powerful encoder.
Therefore, this corollary presents severe conceptual and empirical problems
for the segmental theory.
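For readers unfamiliar with Post tag systems, the sketch below shows how
little machinery one needs. It is a generic tag-system interpreter, not the
phonological encoding referred to above; the function name and the toy rule
table are assumptions chosen for illustration.

```python
# Illustrative sketch of a Post tag system; names and rule table are hypothetical.

def run_tag_system(word, productions, deletion_number=2, max_steps=1000):
    """Run a tag system and return the list of words visited.

    Each step reads the first symbol, deletes the first `deletion_number`
    symbols, and appends that symbol's production to the end of the word.
    The run stops when the word is too short, when the first symbol has no
    production, or when `max_steps` is reached.
    """
    history = [word]
    for _ in range(max_steps):
        if len(word) < deletion_number or word[0] not in productions:
            break
        word = word[deletion_number:] + productions[word[0]]
        history.append(word)
    return history


if __name__ == "__main__":
    rules = {"a": "bc", "b": "a", "c": "aaa"}
    for step in run_tag_system("aaa", rules, max_steps=4):
        print(step)
```

Each step reads at the front of the word and writes at the back, so the
rewriting is nonlocal in the sense the report attributes to Post-style
rules.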

In short, even if we ignore questions of feasibility, the smallest segmental
grammar-dictionary capable of enumerating the set of observed surface
forms cannot be natural because it must discover too many unnatural
generalizations.

12 In my brief experience as a phonologist, the most natural grammar did not have the
smallest number of symbols, even when the proper morphemic decomposition of underlying
forms was known in advance. With enough time and mental discipline, it was
possible to construct a smaller grammar than the "correct" one, by taking advantage of
"unnatural" patterns in the observed surface forms. Increasing the number of examples
does not help, simply because there will never be enough examples to exclude all the
computable but unnatural patterns.


How then can we make sense of the SPE evaluation metric? The evaluation
metric makes certain sets of disjunctively ordered elementary rules as
natural as an elementary rule. The fundamental difference between a complex
rule and an elementary rule is that a complex rule is capable of performing
nonlocal Post-style rewriting, whereas elementary rules are limited to local
Thue-style rewriting. Therefore, the SPE evaluation metric formalizes the
observation that nonlocal phonological dependencies can be as natural as
local ones. The only difficulty is, the relatively subtle distinction between local
and nonlocal rewriting is overwhelmed by the brute power of an unrestricted
rewriting system to encode arbitrary r.e. dependencies.

This observation suggests a solution, and a promising line of investigation.

The early evaluation metrics included not only a measure of the number
of symbols in the grammar (Kolmogorov complexity), but also a measure
of the length of derivations (time complexity). So, in Morphophonemics of
Modern Hebrew, Chomsky (1951) proposed a mixed evaluation metric:

"Given the fixed notation, the criteria of simplicity governing
the ordering of statements are as follows: that the shorter
grammar is the simpler, and that among equally short grammars,
the simplest is that in which the average length of derivation of
sentences is least." (p.6)

"The criteria for justification of ordering are as given at the
conclusion of section 0: simplicity is increased by 1. reduction
of the number of symbols in a statement (paired brackets, etc.,
counting as one symbol); 2. reduction of the length of
derivations, with the second requirement subsidiary. Actually, it
applies only once, and then in a trivial fashion. I mention it only
to indicate explicitly that this consideration, taken as subsidiary,
will not materially increase the ordering restrictions." (pp.51-2)

A similar proposal appears in SPE, in the context of how the conventions
of conjunctive and disjunctive ordering should be applied:

A natural principle that suggests itself at once is this: abbreviatory
notations must be selected in such a way as to maximize
disjunctive ordering. The principle seems to us a natural
one in that maximization of disjunctive ordering will, in general,
minimize the length of derivations in the grammar. The question
of how an internalized grammar is used in performance (speech
production or perception) is of course quite open. Nevertheless,
it seems reasonable to suppose that the grammar should be
selected in such a way as to minimize the amount of 'computation'
that is necessary, and that 'length of derivation' is one
factor in determining 'complexity of computation'. Naturally,
this principle must be regarded as quite tentative. We will
adhere to it where a choice arises, but we have very little evidence
for or against it. To find empirical evidence bearing on a principle
of this degree of abstractness is not an easy matter, but the
issue is important, and one should bear it in mind in a detailed
investigation of phonological structure. (SPE, p.63)

As stated, the addition of complexity concerns does not make a difference,
because the derivation-length is strictly ordered within the symbol-count.
The idea of combining Kolmogorov and computational complexities is
attractive, however, for all the reasons mentioned. Let us therefore replace the
SPE/Morphophonemics metric with a metric inspired by the time-bounded
Kolmogorov complexity of Levin (1973). Let time(g, u) be the number of
steps taken by grammar g to produce surface form g(u) on input underlying
form u. Then the complexity K(g, D) of a phonological grammar g and
dictionary of underlying forms D is:

\[ K(g, D) = |g| + |D| + \sum_{u \in D} \mathrm{time}(g, u) \]

A grammar-dictionary (g_1, D_1) is preferred over another grammar-dictionary
(g_2, D_2) if (i) it is less complex, K(g_1, D_1) < K(g_2, D_2); and (ii) both
are extensionally equivalent ({g_1(u) : u ∈ D_1} = {g_2(u) : u ∈ D_2}).

Now it is possible, at least in principle, to find the optimal
grammar-dictionary, by a simple sequential search. And the phonology is
only able to discover efficiently computable and simple patterns, a potential
improvement over the SPE proposal.
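The metric and the preference relation can be made concrete with a toy
model. Everything specific below is an assumption made for illustration: a
grammar is an ordered list of string rewrite rules, each rule is tried once,
sizes are counted in characters, and time counts the rules that actually
rewrite the form. None of these conventions comes from the report.

```python
# Illustrative sketch of K(g, D); the grammar model and counting are assumptions.

def derive(grammar, underlying):
    """Try each (lhs, rhs) rule once, in order; return (surface form, steps used)."""
    form, steps = underlying, 0
    for lhs, rhs in grammar:
        if lhs in form:
            form = form.replace(lhs, rhs)
            steps += 1
    return form, steps


def complexity(grammar, dictionary):
    """K(g, D) = |g| + |D| + total derivation time over the dictionary."""
    grammar_size = sum(len(lhs) + len(rhs) for lhs, rhs in grammar)
    dictionary_size = sum(len(u) for u in dictionary)
    total_time = sum(derive(grammar, u)[1] for u in dictionary)
    return grammar_size + dictionary_size + total_time


def surface_forms(grammar, dictionary):
    """The set of surface forms generated from the dictionary."""
    return {derive(grammar, u)[0] for u in dictionary}


def preferred(gd1, gd2):
    """True if gd1 is extensionally equivalent to gd2 and strictly less complex."""
    return (surface_forms(*gd1) == surface_forms(*gd2)
            and complexity(*gd1) < complexity(*gd2))


if __name__ == "__main__":
    rote = ([], ["busez", "kisez"])                # store surface forms directly
    rule = ([("sz", "sez")], ["busz", "kisz"])     # one epenthesis-like rule
    print(complexity(*rote), complexity(*rule), preferred(rote, rule))
```

On a corpus this small the rote dictionary scores lower than the rule-based
one, which is only an artifact of the toy counting and of the tiny corpus;
the sketch is meant to show how the three terms combine, not which grammars
the metric would select in practice.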

Although this evaluation metric solves the technical problems of earlier
proposals, it is not clear that it would result in natural grammars. In my
opinion, the notion of a "linguistically significant generalization" is best
formalized by postulating a weak encoder, that can only discover linguistically
significant generalizations. The evaluation metric, rather than the symbol
count of the grammar, is the minimum description length criterion applied
to the set of observed surface forms, which are encoded relative to the model
class of restricted direct-prediction phonologies (cf. Rissanen 1978). That
way, in order to minimize the encoding of the set of observed surface forms,
the grammar must discover the linguistically significant generalizations after
having seen a sufficiently large (finite) set of surface forms. The grammar
is also forced to describe phrase-level phonological processes, because no
finite word/morpheme dictionary is capable of doing so. The research
challenge, then, is to design such a probabilistic model class for phonology. The
modern autosegmental model seems to be a good place to start, because
suprasegmental processes are a large class of the phonologically significant
generalizations.


2.1.4 Modeling phonological dependencies

In SPE, knowledge of phonological dependencies is modeled with an
unrestricted Post-style rewriting system. Such a system is capable of encoding
arbitrary r.e. dependencies in the derivation of surface forms from
underlying forms. It forms a universal basis for computable knowledge.

We know that phonological dependencies can be complex, exceeding the
capacity of a finite-state encoder. However, not every segmental grammar
generates a natural set of sound patterns. So why should we have any faith
or interest in the formal system? The only justification for these formal
systems then would seem to be that they are good programming languages
for phonological processes, that clearly capture our intuitions about
human phonology. But segmental theories are not always appropriate. Their
notation is constrained, which limits their expressive power. Interactions
among phonological processes are hidden in rule ordering, disjunctive
ordering, blocks, and cyclicity. Yet, despite these attempts to formalize the
notion of a natural phonological dependency, it is possible to write a
segmental grammar for any recursively enumerable set.

Natural phonological processes seem to avoid complexity and simplify
interactions. It is hard to find a phonological constraint that is absolute and
inviolable. There are always exceptions, exceptions to the exceptions, and
so forth. Deletion processes like apocope, syncope, cluster simplification and
stray erasure, as well as insertions, seem to be motivated by the necessity
of modifying a representation to satisfy a phonological constraint, not to
exclude representations, to hide vast computations, or to generate complex
sets, as we have used them here.

The next step in the research program initiated here is to design an
appropriate formal phonological model, along the lines discussed above, in order
to answer the fundamental questions of naturalness, appropriate
generalization, and what seems to be the lynchpin of the phonology, the omission of
directly predictable information. It is also now possible to discuss the notion
of computationally accessible information, which has played such an
important role in modern cryptography, and to consider a more sophisticated (yet
still constructive) complexity thesis for human language, based on the
fundamental ideas of entropy and computation. We might hypothesize that
knowledge of phonology, and of all linguistic dependencies, is
computationally accessible in the sense of Yao (1988).


2.2 Autosegmental phonology


In the past decade, generative phonology has seen a revolution in the
linguistic treatment of suprasegmental processes such as tone, harmony,
infixation/interleaving, and stress assignment. Although these autosegmental
models have yet to be formalized, they may be briefly described as follows.
Rather than one-dimensional strings of segments, representations may be
thought of as "a three-dimensional object that for concreteness one might
picture as a spiral-bound notebook," whose spine is the segmental string
and whose pages contain simple constituent structures that are
independent of the spine (Halle 1985). One page represents the sequence of tones
associated with a given articulation. By decoupling the representation of
tonal sequences from the articulation sequence, it is possible for segmental
sequences of different lengths to nonetheless be associated to the same tone
sequence. For example, the tonal sequence Low-High-High, which is used by
English speakers to express surprise when answering a question, might be
associated to a word containing any number of syllables, from two (Brazil)
to twelve (floccinaucinihilipilification) and beyond. Other pages (called
"planes") represent morphemes, syllable structure, vowels and consonants,
and the tree of articulatory (that is, phonetic) features.
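The decoupling of the tonal page from the segmental spine can be sketched as
a mapping between two independent sequences. The association conventions
used here are assumptions for illustration (the report does not state them):
tones link to syllables one-to-one from the left, the last tone spreads over
any remaining syllables, and any leftover tones dock on the last syllable.
The function name and the syllable divisions are likewise hypothetical.

```python
# Illustrative sketch; the association conventions are assumed, not from the report.

def associate(tones, syllables):
    """Link a tone sequence to a syllable sequence; return (syllable, tones) pairs."""
    if not tones or not syllables:
        return [(syl, []) for syl in syllables]
    links = []
    for i, syl in enumerate(syllables):
        # One tone per syllable from the left; the last tone spreads rightward.
        tone = tones[i] if i < len(tones) else tones[-1]
        links.append((syl, [tone]))
    if len(tones) > len(syllables):
        # Leftover tones all dock on the final syllable.
        links[-1] = (syllables[-1], list(tones[len(syllables) - 1:]))
    return links


if __name__ == "__main__":
    melody = ["L", "H", "H"]
    print(associate(melody, ["Bra", "zil"]))
    print(associate(melody, ["Mis", "sis", "sip", "pi"]))
```

The same three-tone melody is reused unchanged for words of different
lengths, which is the point of keeping the two sequences on separate pages.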




2.2.1 Complexity of autosegmental recognition

Now we prove that the PRP for autosegmental models is NP-hard, a
significant reduction in complexity from the undecidable and PSPACE-hard
computations of segmental theories. (Note however that autosegmental
representations have augmented, but not replaced, portions of the segmental
model, and therefore, unless something can be done to simplify segmental
derivations, modern phonology inherits the intractability of purely
segmental approaches.)

Let us begin by thinking of the NP-complete 3-Satisfiability problem (3SAT)
as a set of interacting constraints. In particular, every satisfiable Boolean
formula in 3-CNF is a string of clauses $C_1, C_2, \ldots, C_p$ in the
variables $x_1, x_2, \ldots, x_n$ that satisfies the following three
constraints: (i) negation: a variable $x_j$ and its negation $\bar{x}_j$
have opposite truth values; (ii) clausal satisfaction: every clause
$C_i = (a_i \lor b_i \lor c_i)$ contains a true literal (a literal is a
variable or its negation); (iii) consistency of truth assignments: every
positive literal of a given variable is assigned the same truth value,
either 1 or 0.

Lemma 2.2.1 Autosegmental representations can enforce the 3SAT
constraints.

Proof. The idea of the proof is to encode negation and the truth values
of variables in features; to enforce clausal satisfaction with a local
autosegmental process, such as syllable structure; and to ensure consistency
of truth assignments with a nonlocal autosegmental process, such as a
nonconcatenative morphology or long-distance assimilation (harmony). To
implement these ideas we must examine morphology, harmony, and syllable
structure.

Morpheme interleaving. In the more familiar languages of the world, such
as Romance languages, words are formed primarily by the concatenation of
morphemes. In other languages, such as the Semitic languages, words are
formed by interleaving the segments of different morphemes. For example,
the Classical Arabic word kataba, meaning 'he wrote', is formed by
interleaving (with repetition) the 1 segment of the active perfective
morpheme a with the 3 segments of the ktb morpheme (cf. McCarthy 1981).
(Constraints on syllable structure, discussed below, explain why the 1
underlying vocalic segment /a/ appears 3 times in the surface form.) In the
autosegmental model, each morpheme is assigned its own plane. We can use
this system of representation to ensure consistency of truth assignments.
Each Boolean variable $x_i$ is represented by a separate morpheme $\mu_i$,
and every literal of $x_i$ in the string of formula literals is associated
to the one underlying morpheme $\mu_i$.
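
The interleaving of a consonantal root with a vocalic melody can be sketched
as follows. This is only an illustration: the CV template string, the
function name, and the rule that the last melody segment is reused when the
melody runs out (a crude stand-in for autosegmental spreading) are
assumptions of the example, not McCarthy's formal analysis.

```python
def interleave(root, melody, template):
    """Fill C slots from the consonantal root and V slots from the vocalic melody.

    When the melody is shorter than the number of V slots, its last segment
    is reused (an assumed simplification of spreading).
    """
    consonants = iter(root)
    out, v = [], 0
    for slot in template:
        if slot == "C":
            out.append(next(consonants))
        else:
            out.append(melody[min(v, len(melody) - 1)])
            v += 1
    return "".join(out)

word = interleave("ktb", "a", "CVCVCV")
```

Because the single melody segment is shared by all three V slots, every
surface vowel necessarily has the same quality, which is the property the
reduction exploits to keep all literals of one variable consistent.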

Harmony. Assimilation is the common phonological process whereby some
segment comes to share properties of an adjacent segment. In English,
consonant nasality assimilates to immediately preceding vowels. Assimilation
also occurs across morpheme boundaries, as the varied surface forms of the
prefix in- demonstrate: in+tolerable → intolerable, but in+logical →
illogical and in+probable → improbable. In other languages, assimilation is
unbounded and can affect nonadjacent segments: these assimilation processes
are called harmony systems. In the Turkic languages all suffix vowels
assimilate the backness feature of the last stem vowel; in Capanahua, vowels
and glides that precede a word-final deleted nasal (an underlying nasal
segment absent from the surface form) are all nasalized. In the
autosegmental model, each harmonic feature is assigned its own plane. As
with morpheme interleaving, we can represent each Boolean variable by a
harmonic feature, and thereby ensure consistency of truth assignments.
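
Backness harmony of the kind described for the Turkic languages can be
sketched in a few lines. The vowel inventories, the underspecified suffix
vowel written 'A', and the use of the Turkish plural suffix (-lar/-ler) are
simplifying assumptions chosen for illustration.

```python
BACK_VOWELS = set("aıou")
FRONT_VOWELS = set("eiöü")

def harmonize(stem, suffix):
    """Realize the underspecified suffix vowel 'A' as 'a' or 'e',
    copying backness from the last vowel of the stem."""
    last = next(ch for ch in reversed(stem) if ch in BACK_VOWELS | FRONT_VOWELS)
    vowel = "a" if last in BACK_VOWELS else "e"
    return stem + suffix.replace("A", vowel)

forms = [harmonize("kitap", "lAr"), harmonize("ev", "lAr")]
```

The suffix vowel has no backness value of its own; it is fixed entirely by
a nonadjacent stem vowel, which is what makes harmony a nonlocal dependency.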

Syllable structure. Words are partitioned into syllables. Syllables are the
fundamental unit of segmental organization (Clements and Keyser, 1983).
Each syllable contains one or more vowels 'V' (its nucleus) that may be
preceded or followed by consonants 'C'. For example, the Arabic word kataba
consists of three two-segment syllables, each of the form CV. Every segment
is assigned a sonority value, which (intuitively) is proportional to the
openness of the vocal cavity. For example, vowels are the most sonorous
segments, while stops such as /p/ or /b/ are the least sonorous. Syllables
obey a language-universal sonority sequencing constraint (SSC), which states
that the nucleus is the sonority peak of a syllable, and that the sonority
of adjacent segments swiftly and monotonically decreases. We can use the SSC
to ensure that every clause $C_i$ contains a true literal as follows. The
central idea is to make literal truth correspond to the stricture feature,
so that a true literal (represented as a vowel) is more sonorous than a
false literal (represented as a consonant). Each clause
$C_i = (a_i \lor b_i \lor c_i)$ is encoded as a segmental string
$x - x_a - x_b - x_c$, where $x$ is a consonant of sonority 1. Segment $x_a$
has sonority 10 when literal $a_i$ is true, 2 otherwise; segment $x_b$ has
sonority 9 when literal $b_i$ is true, 5 otherwise; and segment $x_c$ has
sonority 8 when literal $c_i$ is true, 2 otherwise. Of the eight possible
truth values of the three literals and the corresponding syllabifications,
only the syllabification corresponding to three false literals is excluded
by the SSC. In that case, the corresponding string of four consonants
C-C-C-C has the sonority sequence 1-2-5-2. No immediately preceding or
following segment of any sonority can result in a syllabification that obeys
the SSC. Therefore, all Boolean clauses must contain a true literal. □
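
The clause encoding can be checked by brute force under one simple reading
of the SSC. The formalization below is an assumption made for this sketch:
a syllable is taken to be a run of segments whose sonority rises strictly to
a single peak and then falls strictly, the peak must be a vowel, and a
segment counts as a vowel when its sonority is at least 8. The sonority
values follow the encoding above, with 9 taken as the value of a true
$x_b$.

```python
from itertools import product

VOWEL_THRESHOLD = 8  # assumption: sonority >= 8 counts as a vowel (possible nucleus)

def is_syllable(seg):
    """One syllable: strictly rising sonority up to a vocalic peak, then strictly falling."""
    peak = seg.index(max(seg))
    if seg[peak] < VOWEL_THRESHOLD:
        return False
    rising = all(seg[i] < seg[i + 1] for i in range(peak))
    falling = all(seg[i] > seg[i + 1] for i in range(peak, len(seg) - 1))
    return rising and falling

def syllabifiable(seq):
    """True if seq can be cut into contiguous syllables that each satisfy is_syllable."""
    if not seq:
        return True
    return any(is_syllable(seq[:k]) and syllabifiable(seq[k:])
               for k in range(1, len(seq) + 1))

def clause_sonorities(a, b, c):
    """Sonority sequence x, x_a, x_b, x_c for literal truth values a, b, c."""
    return [1, 10 if a else 2, 9 if b else 5, 8 if c else 2]

results = {bits: syllabifiable(clause_sonorities(*bits))
           for bits in product([False, True], repeat=3)}
```

Under these assumptions, the only assignment with no valid syllabification
is the one in which all three literals are false.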

The only fact needed to obtain an NP-hardness result from lemma 2.2.1
is the fundamentally elliptical nature of speech, as described by Jakobson
and Halle (1956):

Usually ... the context and the situation permit us to disregard
a high percentage of the features, phonemes and sequences
in the incoming message without jeopardising its comprehension.
The probability of occurrence in the spoken chain varies for
different features and likewise for each feature in different texts.
For this reason it is possible, from a part of the sequence, to
predict with greater or lesser accuracy the succeeding features,
to reconstruct the preceding ones, and finally to infer from some
features in a bundle the other concurrent features.

Since in various circumstances the distinctive load of the
phonemes is actually reduced for the listener, the speaker, in
his turn, is relieved of executing all the sound distinctions in his
message: the number of effaced features, omitted phonemes and
simplified sequences may be considerable in a blurred and rapid
style of speaking. The sound shape of speech may be no less
elliptic than its syntactic composition. Even such specimens as
the slovenly /tem mins sem/ for 'ten minutes to seven', quoted
by D. Jones, are not the highest degree of omission and
fragmentariness encountered in familiar talk. (pp.5-6)

The direct consequence of lemma 2.2.1, and the fact that not all sound
distinctions are executed, and those that are may be corrupted, is:

Theorem 5 PRP for the autosegmental model is NP-hard.

Proof. By reduction from 3SAT. The idea is to construct a surface form that
completely identifies the variables and their negation or lack of it, but
does not specify the truth values of those variables. That is, the stricture
feature has been ellipsed. The dictionary will generate all possible
underlying forms (interleaved morphemes or harmonic strings), one for each
possible truth assignment, and the autosegmental representation of
lemma 2.2.1 will ensure that generated formulas are in fact satisfiable. □
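
The shape of this reduction can be made concrete with a brute-force
recognizer. This is a sketch under stated assumptions: the surface form is
modeled as a list of clauses whose literals are hypothetical pairs (variable
index, negated flag), and the elided stricture values are modeled as an
unknown truth assignment. The exhaustive search over assignments is for
illustration only and says nothing about how a language user actually
computes.

```python
from itertools import product

def recognizable(surface_clauses, n_vars):
    """The surface form fixes variables and negation; truth values are elided.

    Try every underlying form (truth assignment) the dictionary could supply
    and accept if one of them gives every clause a true literal.
    """
    for values in product([False, True], repeat=n_vars):
        if all(any(values[v] != neg for v, neg in clause)
               for clause in surface_clauses):
            return True
    return False
```

For example, `recognizable([[(0, False), (1, True), (1, False)]], 2)` accepts,
while a form that pairs the clause (x0, x0, x0) with its negation cannot be
recognized under any assignment.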

2.2.2 Suprasegmental dependencies

It is informative to reexamine these suprasegmental processes from an
information-theoretic perspective. The relationship between the sound of
a word and its meaning is inherently arbitrary. A given sequence of
articulations could in principle mean anything; a given meaning could in
principle have any articulation. And it seems that the storage capacity of
the human brain has, for all practical purposes, no limit (cf. Luria 1968).

Yet there appear to be two primary sources of phonological systematicity,
that is, of sound patterns both among and inside surface forms.

The phonetic systematicity in the mental lexicon arises from the fact that
words consist of morpheme combinations. Although words that share
morphemes need not in principle share phonological patterns, most in fact
do. This makes it easier to acquire words, and to invent them, because the
meaning of a word is given by its relation to other words, as well as by its
intrinsic content. A regular mapping from morphology to phonology simplifies
the acquisition and invention of new words.

The phonetic systematicity in a surface form is due to suprasegmental
processes. A suprasegmental process p establishes the domains within which
one element in each domain, the head, is distinguished phonetically from
the other nonhead elements in that domain. There are three parameters of
variation:^13

1. The phonological representation r on which p operates, including
syllables and any node in the tree of articulatory features. (The elements
of r are called the "p-bearing units.")

2. A restricted class of trees, whose leaves are attached to the p-bearing
units. Each nonterminal in such a tree immediately dominates one
head and a (possibly empty) set of nonheads, thereby representing
a suprasegmental domain. The domains defined by these trees are
contiguous and exhaustive.

3. The entirely local process that realizes the abstract distinction between
heads and nonheads in the phonetics.

^13 This incomplete proposal is inspired by the Halle and Vergnaud (1987)
treatment of phonological stress and by conversations with Morris Halle.

Suprasegmental processes maintain systematic distinctions between adjacent
segments of each surface form, as well as ensuring that segmental strings
have global properties, and thereby contribute to efficient production and
error-correcting comprehension (cf. Jakobson and Halle, 1956).

Syllables organize the phonological segments of a given language. A string
of segments is a possible sound if and only if it can be partitioned into a
sequence of substrings, each of which corresponds to a permissible
syllable. Syllables represent, in part, the sonority hierarchy between the
nucleus vowel and its consonantal onset and coda. Syllabic domains are given
by the language-universal sonority hierarchy as well as by
language-particular constraints, which may be represented with a small set
of syllable templates, such as 'CV' and 'CVC'.

1. Segments are the syllable-bearing units.
   a. local sonority maxima are inherent heads.

2. Compute syllabic domains.

3. Perform segmental adjustments.
   a. insert or delete units to satisfy constraints.
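
The "possible sound" condition stated above (exhaustive partition into
permissible syllables) can be sketched as a small recursive check over a CV
skeleton. The function name, the use of a C/V skeleton string in place of
real segments, and the default template set are assumptions of this example;
sonority and the adjustment step are not modeled.

```python
from functools import lru_cache

def possible_sound(skeleton, templates=("CV", "CVC")):
    """True if the CV skeleton can be cut exhaustively into permissible syllables."""
    @lru_cache(maxsize=None)
    def from_pos(i):
        # Reached the end of the string: the partition is exhaustive.
        if i == len(skeleton):
            return True
        # Try every template that matches at position i.
        return any(skeleton.startswith(t, i) and from_pos(i + len(t))
                   for t in templates)
    return from_pos(0)
```

With these two templates, `possible_sound("CVCCV")` holds (CVC + CV), while
`possible_sound("CCV")` and `possible_sound("CVCC")` do not.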

The stress-bearing units of a language are either entire syllables, or their
moras.^14 Stress domains are defined by a class of finite-depth trees, each
of whose levels are characterized by four language-universal parameters:
feet are bounded or unbounded; the head of a foot is left-terminal, medial,
or right-terminal; and feet are constructed left-to-right or right-to-left
(Halle and Vergnaud, 1987). For example, the English suprasegmental stress
process would be described as:

1. Syllable heads are the stress-bearing units.
   a. unmarked: heavy ultima is inherent head.
   b. marked: heavy syllable is inherent head.

2. Compute stress domains.

3. Perform phonetic adjustments.
   a. shorten head of open word-medial nonhead syllables.
   b. destress head adjacent to more prominent head.
   c. reduce stressless vowel to schwa.

^14 Moras are the units of syllable weight; a heavy syllable has more moras
than a light syllable. Technically, a mora is a segment dominated by the
syllable nucleus, Clements and Keyser (1983).

The assimilation-bearing units of a language are those nodes of the
articulatory tree that interact with assimilation processes, including
harmonic and blocking features. Long-distance harmony corresponds to an
unbounded domain of assimilation. A prototypical assimilation process might
look like:

1. Articulatory node n is the assimilation-bearing unit.
   a. assimilation and blocking features are inherent heads.

2. Compute assimilation domains.

3. Perform phonetic adjustments.
   a. spread node n from head segment to its domain.


The remaining suprasegmental processes are also straightforward. For
example, the melody plane represents the segmental domain of equi-tonality.
This is, of course, an informal proposal intended to illuminate the
relationship between the autosegmental processes and the segmental string,
and to suggest a formalization of autosegmental phonology.


Chapter 3 

Syntactic Agreement and 
Lexical Ambiguity 


In this chapter, we consider the computational process of computing the
language user's knowledge of syntax. Knowledge of syntax includes knowledge
of syntactic dependencies, such as agreement or selection, and knowledge of
purely syntactic distinctions, such as noun/verb or singular/plural.
Syntactic dependencies are defined with respect to the syntactic
distinctions. That is, we say "a verb selects a noun phrase," or "the noun
and verb agree on number." An adequate linguistic theory must represent
these dependencies and list the set of possible syntactic distinctions.

By any account, syntactic dependencies are complex, involving the
interaction of local and nonlocal relations. Seemingly local decisions, such
as the disambiguation of a particular word, can have global consequences.
This suggests that it may be difficult to assign a structural description to
a sequence of ambiguous words. In order to translate this informal
observation into a formal proof, we must define the problem of assigning a
structural description to a string of words.

We immediately encounter two difficulties. The first is that no one
understands what it means to successfully comprehend an utterance. As
discussed in the introduction, it cannot mean to find exactly the structural
description in the head of the speaker, because this may not be possible.
Nor can it mean to find any structural description for the utterance,
because this is the trivial language miscomprehension problem. In short, it
is not possible to define the LC problem for syntax without a simple
characterization of the class of appropriate structural descriptions. In
order to overcome this obstacle, we must define our problem so that it is a
natural subproblem of any reasonable statement of the language
comprehension problem.

The second difficulty is that, unlike the anaphora problem considered in
chapter 1, the class of structural descriptions for syntax does not have a
simple theory-invariant (that is, direct) characterization. There is a wide
range of competing syntactic theories, and they differ significantly. It is
possible to broadly distinguish two classes of syntactic theories.

• In unification-based theories, such as lexical-functional grammar or
generalized phrase structure grammar, atomic features represent
possible distinctions. Syntactic dependencies are all stated in terms of
one simple mechanism: the unification of uniform sets of features
between a phrase structure node and its immediate ancestor, children,
or siblings. For example, subject-verb agreement is implemented by a
chain of local unifications: the subject noun is unified with the subject
NP (its ancestor); the subject, with the matrix VP (its sibling); and
the matrix VP with the main verb (its child).

• In current transformational theories, possible distinctions are
represented by features and morphemes. Syntactic dependencies consist
of particular linguistic relations, such as predication, selection, and
theta-role assignment. They are defined primarily in terms of local
phrase structure configurations at different levels of representation.
The mapping between different levels of representation is performed
by transformations. For example, subject-verb agreement results
because the subject specifies the features of an agreement morpheme; this
morpheme is subsequently combined with the verb root morpheme at
a different level of representation.
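
The chain of local unifications described for the first class of theories
can be sketched with flat feature sets. This is a deliberately simplified
illustration: real unification-based grammars use structured feature values
and shared variables, and the dictionary representation and function names
here are assumptions of the example.

```python
def unify(a, b):
    """Unify two flat feature sets (dicts); return None on a feature clash."""
    merged = dict(a)
    for feature, value in b.items():
        if merged.setdefault(feature, value) != value:
            return None
    return merged

def agree(noun, subject_np, vp, verb):
    """Chain of local unifications: noun with NP, NP with VP, VP with verb."""
    result = noun
    for node in (subject_np, vp, verb):
        result = unify(result, node)
        if result is None:
            return None
    return result

good = agree({"num": "sg"}, {}, {}, {"num": "sg", "per": 3})
clash = agree({"num": "pl"}, {}, {}, {"num": "sg"})
```

A number feature introduced on the subject noun reaches the main verb only
through the intermediate nodes, so a clash anywhere along the chain blocks
agreement.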

In order to overcome this obstacle of competing theories and achieve an
invariant analysis, we must first define the language comprehension problem
for syntax relative to the linguistic theory, and then analyze its complexity
for both classes of linguistic theories.

Our subproblem of choice is the lexical resolution problem (LRP) for a given
syntactic theory: Given a partial syntactic representation R that yields a
string of ambiguous or underspecified words, and a lexicon L containing
ambiguous words, can the words in R be found in the lexicon L? This
problem statement overcomes the two difficulties. It is defined relative to
the syntactic theory, and the language user must solve the LRP in order to
find an appropriate structural description.

Unification-based theories are very similar from a formal perspective. And
because all syntactic dependencies are stated in terms of a simple, uniform
mechanism (feature unification), it has been straightforward to prove that
the LRP for these theories is NP-hard (Ristad and Berwick, 1989). In this
chapter, we prove that the LRP for modern transformational theories is also
NP-hard. Both proofs rely on the particular details of some linguistic
theory. But the combined effect of these two results for the LRP is to argue
for the NP-hardness of the "true" language comprehension problem for syntax.

The chapter is organized into four sections. The first section introduces
the structural descriptions of current transformational theories, with
motivation. In section 3.2, we prove that the LRP is NP-hard for these
theories, under a very abstract formulation. Next, section 3.3 shows exactly
how this abstract formulation applies to the "Barriers" theory of Chomsky
(1986). The conclusion discusses the central role that locality plays in
transformational theories, and the consequences of allowing uniform
mechanisms in the linguistic theory.


3.1 Morpho-syntactic dependencies

The syntactic structure underlying even extremely simple constructions can 
be quite intricate. Consider the simple passive expression (1).

(1) John was seen.

What is it that English speakers know about this expression? For one, they
know that the expression describes an event that occurred in the past. This
information is contained in the verb was, which is overtly inflected for the
past tense as well as overtly agreeing with the surface subject on number.
One way to represent this knowledge is to say that the overt form was
underlyingly consists of the three morphemes [be], [past], and [singular].

English speakers also know that the expression is in the passive voice: that
the overt subject John is subjected to the "seeing action," and therefore
stands in the same relation to the verb see in (1) as it does in the
corresponding active expression Someone saw John. That is, (a) the verbal
form seen consists of the verb root [see] and the voice morpheme [passive],
which happens to be realized as the +en suffix here, and (b) John is the
underlying direct object of the verb see in both active and passive variants
of the expression.

Figure 3.1: The underlying (partial) structural description of John was seen.

In order to represent this knowledge that language users in fact have about
such expressions, we assign the partial structural description depicted in
figure 3.1 to the utterance (1), where each surface word has been exploded
into its underlying morphemes.

The morphemes are organized hierarchically according to X-bar theory.
X-bar theory states that morphemes of type X project into syntactic
constituents of type X. That is simply to say, for example, that a verb
phrase must contain a verb and a noun phrase must contain a noun. The
relation between a morpheme X and its complement (a phrase YP) is
represented by the sisterhood configuration inside the first projection of
X (2).

(2) [X1 X0 YP]


Selection is an instance of complementation. For example, the aspect
morpheme [be] selects the voice morpheme [passive], which accounts for the
deviancy of the minimally different expression John was see.

The relation between the first projection of X and its specifier (a phrase
YP) is represented by sisterhood in the second projection of X (3).

(3) [X2 YP X1]


The second projection X2 of the morpheme X is a phrase of type X, also
written XP. Agreement is an instance of specification. For example, the
proper noun John specifies the agreement morpheme [singular] in the
underlying structural description in figure 3.1.

Finally, the underlying thematic relation between the verb see and John is
represented indirectly, by postulating a trace $t_i$ that is selected by the
verb see (a trace is a phonologically silent placeholder) and is assigned
the same index as the element whose place it is holding (John$_i$).

In assigning this representation to the utterance (1), we were guided by the
principle of universal explanation (UE). UE states that there is only one
underlying language, from which particular languages differ in trivial
ways.^1 One consequence of UE is that if any language makes an overt
distinction, then all languages must make that distinction in their
underlying representations. For example, in languages such as Hindi, verbs
agree with their direct object and their subjects (Mahajan 1989); therefore
object and subject must both appear in the specifier position of agreement
phrases in all languages. A second consequence of UE is that all clauses
have the same underlying structure. At the very least, a clause must contain
a subject, a tense, and a verb. And because it contains a subject, it must
also contain an agreement morpheme.

^1 The principle of universal explanation is a particular theory of what
constitutes universal grammar, that is, a theory of the innate endowment of
the language user. It is fundamental to the study of language. For
historical example, see the influential work of James Beattie (1788),
especially his analysis of tense. And in his award-winning 1772 essay on the
origin of language, Johann Gottfried von Herder says, "Who can (whatever
he may have to say upon the subject) entirely deny the fundamental
connection existing between most languages? There is but one human race
upon the earth, and but one language." (p.113)

We were also guided by the goal of representing linguistic relations
uniformly, via local phrase structure configurations. So the selection
relation between a verb V and its underlying direct object XP is always
represented by the complement configuration "[V1 V XP]". And when the direct
object appears as the surface subject, as in the passive, a trace is used as
a place-holder.

Now consider the expression (4). 

(4) Tom saw Mary yesterday.

If we examined certain cross-linguistic facts, and obeyed the principle of 
universal explanation, we would assign the underlying structural description 
in figure 3.2 to this expression.

The verb see selects the proper noun Mary as its direct object, resulting in
a V1 projection. This V1 predicate is specified by its subject, the proper
noun Tom. The relation of modification between the resulting VP and the
adverb yesterday is represented configurationally as adjunction to VP. The
remaining morphemes (object agreement, verbal tense, and subject agreement)
appear in this structural description, but have not yet been specified. This
is indicated by the empty categories "[e]" in their specifier positions. "C"
is the complementizer morpheme, which is phonologically null in declarative
expressions.

The underlying representation in figure 3.2 undergoes certain movement 
transformations, resulting in the surface form in figure 3.3.

First, the underlying object Mary moves to the specifier position of the
object agreement phrase, so that the agreement morpheme will be specified
[singular] and so that the object Mary will be assigned objective case. Next,
Tom, the underlying subject of the verbal predicate [see Mary], moves to the
specifier position of the subject agreement phrase, in order to specify the
agreement morpheme as [singular] and be assigned nominative case. Finally,
the verb see combines first with the object agreement morpheme, then with
the tense morpheme, and finally with the subject agreement morpheme. It is
spelled out as saw. Each movement transformation leaves behind an indexed
trace.


Figure 3.2: The underlying structural description of Tom saw Mary yesterday,
which is a partial representation of the language user's knowledge of
morphological dependencies.


Figure 3.3: The surface form of Tom saw Mary yesterday, the result of
repeatedly applying movement transformations to the underlying form in
figure 3.2.





This analysis is motivated by Chomsky (1986; 1989) and Pollock (1989). The
movement transformations proposed in that work are considerably more
complex than those shown here.

By explicitly representing the dependencies between the morphemes in this
fashion, a number of things become clear. For one, each morpheme typically
interacts with only a few other morphemes, such as its specifier and its
complement. However, because each word consists of several morphemes,
every word interacts with every other word in a clause.

Reconsider our first example (1). In that example, the passive verb form
see+en selects the underlying object $t_i$ and assigns it a 'patient'
thematic role. The underlying object $t_i$ appears as the surface subject
John$_i$; the subject agrees with the inflected aspect be+past+singular,
which assigns it nominative case; and, to complete the circle of
interactions, the inflected aspect was selects the passive verb form. These
properties of words, such as case-marking, thematic role assignment,
selection and agreement, are all independent, not directly deducible from
the phonological form of the words, and potentially correlated in the
lexicon.

It is easy to see that interactions among the words in a sentence can become
extremely complex. Imagine that the lexicon contained three homophonous
verbs (see_1, see_2, and see_3) with the same phonological form but different
selectional restrictions. Then verb phrases could encode satisfied 3-CNF
clauses: see_1 would be false and select a true subject; see_2 would be false
and select a true object; and see_3 would be true and select a subject and
object of any truth value. The consequence is that any verb phrase headed
by see must contain a word representing a true literal. We could even get
two literals of the same variable to agree on their truth values by moving
one to the subject position of the other, where they must agree, exactly as
in the passive construction: the underlying object moves to the subject
position, where it must agree with the auxiliary verb. Then if words were
Boolean literals, it would be possible to encode 3SAT instances in
sentences. The proof in the next section formalizes this intuitive argument.
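
The three homophonous verbs can be written out as a small table to confirm
that every licensed verb phrase contains a true word. The entry names and
the tuple layout (truth value of the verb, required subject value, required
object value, with None meaning "any") are assumptions of this sketch.

```python
from itertools import product

# Hypothetical lexical entries for the homophone "see":
# (truth value of the verb, required subject value, required object value).
SEE = {"see_1": (False, True, None),
       "see_2": (False, None, True),
       "see_3": (True, None, None)}

def licensed(entry, subject, obj):
    """Check the selectional restrictions of one entry against subject and object."""
    _, need_subj, need_obj = SEE[entry]
    return need_subj in (None, subject) and need_obj in (None, obj)

def vp_readings(subject, obj):
    """All entries of 'see' whose selectional restrictions are met."""
    return [e for e in SEE if licensed(e, subject, obj)]

every_vp_has_true_word = all(
    SEE[e][0] or subject or obj
    for subject, obj in product([False, True], repeat=2)
    for e in vp_readings(subject, obj))
```

When both subject and object are false, only see_3 is licensed, and it is
itself true, so the verb phrase behaves like a satisfied three-literal
clause.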

So far our discussion of the language user's knowledge of syntax has
concentrated on knowledge of syntactic dependencies. Let us therefore
conclude this section with a brief discussion of the range of possible
syntactic distinctions. There are two broad classes of distinctions with
syntactic effects, the purely syntactic and the semantic or pragmatic.


Many semantic and pragmatic distinctions, such as animate/inanimate or
abstract/concrete, have syntactic effects. As Chomsky (1965) has observed,
this can be used to account for the deviancy of certain expressions.
For example, the contrast between sincerity may frighten the boy and the
boy may frighten sincerity is accounted for by the fact that frighten
selects animate objects.

Purely syntactic distinctions are those distinctions that are independent of
meaningful extra-linguistic distinctions.^2 For example, the syntactic
distinction among masculine, feminine, and neuter does not correspond to
biological sex; nor does singular/plural correspond to physical or
perceptual indivisibility. Nor do nouns denote things, or verbs, actions, as
has been very wittily argued by Thomas Gunter Browne in 1795. This class of
purely syntactic distinctions includes morpheme class (noun, verb,
adjective, and so forth), so-called agreement features (gender, number,
person, kinship class, etc.), case, grammatical function, thematic role, and
so on. These distinctions vary from language to language, and are all
treated uniformly by the linguistic theory. Syntactic distinctions appear to
originate from semantic distinctions, but soon lose their connection to
meaning. If this is so, as linguists have argued it is, then we have good
reason to believe that the number of syntactic distinctions is unbounded in
principle, limited only by the amount of time it takes the language user to
acquire a new distinction.


3.2 Complexity of linguistic transforms


In this section we prove that the LRP for modern transformational theories is 
NP-hard. The idea of the proof is quite similar to the proofs of lemma 2.2.1 
and theorem 5 above. As in those proofs, we will construct a structural 
description that enforces the 3SAT constraints. Ambiguous words will play
the role of elliptical speech. The words in the structural description will be
ambiguous with respect to a syntactic distinction that corresponds to truth 
value. 

Recall that transformational theories postulate at least two levels of
syntactic representation (the underlying and surface representations) and a
mapping from underlying form to surface form. The underlying form is called
the D-structure (DS) and the surface form is called the S-structure (SS). DS
is a representation of thematic role assignment, selection, and grammatical
function (subject, object, etc.). SS is the syntactic representation closest
to the surface form of a sentence. In current transformational theories, DS
is mapped onto SS by the generalized move-α transformation.

^2 The definition of what is and isn't a syntactic distinction is of course
entirely theory-internal. However, inasmuch as there is any agreement,
syntax includes that portion of linguistic form that is logically
independent of real-world meaning or phonology.

The idea of the proof is to simulate the 3SAT constraints with a complex
syntactic representation, which we will build using one simple building
block, called a "stair."

Definition. A stair is an underlying form $U_i$ with the following
structure:

1. Recursive structure. $U_i$ contains another stair $U_{i+1}$.

2. Selection and agreement are correlated. $U_i$ contains a morpheme $\mu_i$
that selects $U_{i+1}$. Local affixation rules will morphologically merge
the head of $U_i$ with the morpheme $\mu_i$, thereby correlating selectional
properties of $\mu_i$ with the agreement features of $U_i$ in the lexicon.

3. Undergoes obligatory movement. $U_i$ selects and assigns a theta-role
to $U_{i+1}$, but does not assign it case. Therefore $U_{i+1}$ is a properly
governed argument that undergoes obligatory movement in order to
satisfy the case filter. (The same will be true for $U_i$.)

4. Transparent to extraction. $U_i$ allows nodes that can be moved out
of $U_{i+1}$ to also be moved out of $U_i$. (This kind of long movement is
typically done by successive cyclic movement between bounding nodes
in order to satisfy the subjacency condition of bounding theory.)

5. Contains a landing site. $U_i$ contains a specifier position that is
assigned case. The head of $U_i$ will agree with its specifier; therefore
only stairs that agree with the head of $U_i$ can be moved to this specifier
position. (Correspondingly, this means that $U_i$ can only move to the
specifier position of a stair $U_j$, $j < i$, that agrees with it.)

Recall the 3SAT constraints on page 35: (i) negation: a variable x_i and
its negation x̄_i have opposite truth values; (ii) clausal satisfaction: every
clause C_i = (a_i ∨ b_i ∨ c_i) contains a true literal (a literal is a variable
or its negation); (iii) consistency of truth assignments: every positive literal
of a given variable is assigned the same truth value, either 1 or 0.
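
The three constraints can be checked mechanically. The following Python
sketch is illustrative only and is not part of the construction in this
report. The representation of a literal as a (variable, polarity) pair and
the name check_3sat_constraints are assumptions made for the example. Each
literal occurrence carries its own guess for the truth value of its
variable, so that consistency (iii) is a genuine condition to be checked.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: a literal is (variable name, True if positive).
Literal = Tuple[str, bool]
Clause = Tuple[Literal, Literal, Literal]


def check_3sat_constraints(
    formula: List[Clause],
    guesses: List[Tuple[bool, bool, bool]],
) -> bool:
    """Check constraints (i)-(iii) for per-occurrence variable guesses.

    guesses[k][j] is the truth value guessed for the *variable* of the
    j-th literal in clause k.
    """
    seen: Dict[str, bool] = {}
    for clause, clause_guesses in zip(formula, guesses):
        clause_true = False
        for (var, positive), var_value in zip(clause, clause_guesses):
            # (iii) consistency: every occurrence of a variable must
            # carry the same truth value.
            if seen.setdefault(var, var_value) != var_value:
                return False
            # (i) negation: a negative literal has the opposite value.
            literal_value = var_value if positive else not var_value
            clause_true = clause_true or literal_value
        # (ii) clausal satisfaction: some literal in the clause is true.
        if not clause_true:
            return False
    return True
```

A formula is satisfiable exactly when some choice of guesses passes this
check; the check itself takes time linear in the size of the formula.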

Lemma 3.2.1 Stairs can enforce the 3SAT constraints.




Proof. The idea of the proof is to represent negation as a morpheme; to
encode the truth values of variables in syntactic features; to enforce clausal
satisfaction in the underlying representation (DS), using selectional
constraints; and to ensure consistency of truth assignments in the surface
representation (SS), using long distance movement and specifier-head
agreement.

The DS consists of one stair per formula literal, which is three stairs per
formula clause. Let the clause C_i = (a_i ∨ b_i ∨ c_i) be represented by the
three stairs U_{i,a}, U_{i,b}, and U_{i,c}.


(5) 


The selectional constraints of the three stairs ensure that each 3-clause
contains at least one true literal, although lexical ambiguity will prevent us
from knowing which literals in the 3-clause are true. To do this, the first
stair U_{i,a} must promise to make C_i true, either by being true itself or by
selecting a stair U_{i,b} that promises to make the 3-clause true; to fulfill its
promise, the second stair U_{i,b} must either be true or select a true stair
U_{i,c}. (If a stair is true, it selects the next stair with either truth
value.) This chain of selectional dependencies is shown in (6).



(6) [Diagram: the chain of selectional dependencies among the three stairs
of a clause]


Affixes listed in the lexicon will negate or preserve variable truth values,
according to whether the corresponding formula literal is negative or positive.

Then, scanning from right to left, each stair is moved to the specifier position
of the closest stair of the same variable, either by long movement or by
successive cyclic movement (see figures 3.5, 3.6).

In the resulting SS, the specifier position of the stair that corresponds to
the ith occurrence of a given variable contains the stair that corresponds to
the (i+1)th occurrence of the same variable. These two stairs agree with each
other by specifier-head agreement. Now all the stairs that correspond to
literals of a given variable are contained in the specifier position of the stair
that corresponds to the first occurrence of that variable (see figure 3.6).

Figure 3.4: On the input 3SAT instance f, a formula of two clauses, the DS in
the figure is created to represent f. Each literal in f is represented by a stair
construction. For example, the first literal of the first clause is represented by
the outermost stair construction. Selectional constraints are enforced at DS.
They ensure that every Boolean clause contains a true literal.

Now all variables have consistent truth assignments, by specifier-head
agreement at SS. All clauses contain a true literal by DS selection. Negation is
performed by affixes. The formula is satisfiable if and only if the
corresponding DS and SS are well-formed. □
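
The movement step of the proof can be sketched in a few lines of Python.
This is a simplified illustration under stated assumptions, not the
construction itself: the class Stair, its fields, and the function
derive_ss are hypothetical names, each stair is reduced to its two
agreement features, and case, theta-roles, and selection are not modeled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stair:
    """A stair reduced to its two agreement features (an assumption)."""
    variable: str                        # identifies the Boolean variable
    truth: bool                          # truth value chosen for that variable
    specifier: Optional["Stair"] = None  # landing site, filled at SS


def derive_ss(stairs: List[Stair]) -> bool:
    """Map a DS (stairs listed outermost first) to an SS.

    Scanning from right to left, each stair moves to the specifier of
    the closest enclosing stair of the same variable. The derivation
    fails as soon as a moved stair disagrees with its host, which is
    how specifier-head agreement enforces consistent truth values.
    """
    for i in range(len(stairs) - 1, -1, -1):
        mover = stairs[i]
        for j in range(i - 1, -1, -1):
            host = stairs[j]
            if host.variable == mover.variable:
                if host.truth != mover.truth:
                    return False         # agreement violation at SS
                host.specifier = mover   # movement to the landing site
                break
    return True
```

In this sketch the stair for the first occurrence of a variable stays in
place; every later occurrence ends up, directly or transitively, inside
its specifier.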

Figure 3.5: This figure depicts the first movement transformation that is
applied in the mapping of the DS in figure 3.4 to the SS in figure 3.6. The
innermost stair (U_{2,c}, representing the last literal in f) moves to the
specifier position of the third stair (U_{1,c}), leaving behind a trace t_{2,c}.
This movement transformation relates the x̄_3 literal of the second Boolean
clause to the x_3 literal of the first Boolean clause. Now both stairs agree, by
specifier-head agreement; therefore, the corresponding literals of the formula
variable x_3 will be assigned the same truth value, even though they appear in
different clauses.

Figure 3.6: This figure shows the SS that results from repeatedly applying
movement transformations to the DS depicted in figure 3.4. Specifier-head
agreement is enforced at SS. It ensures that all instances of a variable are
assigned the same truth value.

Using the construction in lemma 3.2.1, and the fact that words may be
ambiguous, we can now prove the following theorem about the lexical resolution
problem:

Theorem 6 The LRP is NP-hard in models that permit a stair.

Proof. By reduction from 3SAT. The input is a Boolean formula f in 3-CNF;
the output is a lexicon L and a structure S containing underspecified words
such that the words in S can be found in L if and only if f is satisfiable.
The structure S will be built from f according to the stair construction in
lemma 3.2.1. Two stairs will agree if and only if they correspond to literals
of the same variable and have been assigned the same truth value. The
words in the syntactic structure will be ambiguous only in the syntactic
distinction that corresponds to truth value. One agreement feature encodes
variable truth assignments, and another identifies Boolean variables. One
non-agreement feature encodes literal truth values, and a second one keeps
track of the promise in the chain of selectional dependencies shown in (6).
The stair construction ensures that the 3SAT constraints are satisfied by all
permissible lexical choices for the words. □
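
The correspondence between lexical choices and truth assignments can be
made concrete with a brute-force sketch in Python. It is an illustration
only: the function name resolve_lexicon and the encoding of literals as
(variable, polarity) pairs are assumptions, and the exhaustive search is
used for clarity, not because the report proposes it as an algorithm.
Because agreement forces all stairs of one variable to share a truth
value, a lexical choice amounts to one Boolean value per variable.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: a literal is (variable name, True if positive).
Literal = Tuple[str, bool]
Clause = Tuple[Literal, Literal, Literal]


def resolve_lexicon(formula: List[Clause]) -> Optional[Dict[str, bool]]:
    """Search for a lexical choice that makes the structure well-formed.

    Returns one truth value per variable if some choice satisfies every
    clause, and None otherwise. The loop tries up to 2**n choices for
    n variables, so it is exponential in the worst case.
    """
    variables = sorted({var for clause in formula for var, _ in clause})
    for choice in product([False, True], repeat=len(variables)):
        entry = dict(zip(variables, choice))
        # A literal is true when its variable's value matches its polarity.
        if all(
            any(entry[var] == positive for var, positive in clause)
            for clause in formula
        ):
            return entry
    return None
```

The structure has a permissible lexical resolution exactly when this
function returns a value other than None, which is the "if and only if"
claimed in the proof.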


3.3 Complexity of agreement interactions 

A central question for current transformational theories of syntax, such as
Chomsky (1986) and Lasnik and Saito (1984), is what are the consequences
of interacting agreement relations, such as specifier-head agreement,
head-head agreement, head-projection agreement, and the various forms of chain
agreement (link, extension, composition)?

In this section, we reduce this broad question to the narrow question: can
these transformational theories simulate the stair? If yes, then we have
proved that the LRP for those theories is NP-hard. This, in turn, will give
us reason to believe that the interaction of agreement relations can be quite
complex in these models.

Lemma 3.3.1 Barriers allows a stair.

Proof. The noun complement structure depicted in figure 3.7 is a stair
according to the Barriers model of Chomsky (1986). (The definition of a
stair appears on page 51.)

Figure 3.7: A stair construction for the Barriers model of Chomsky (1986). This
is the phrase structure that would be assigned to noun complement constructions,
such as desire to visit places.

1. Recursive structure. NP_i contains NP_{i+1}, the next stair.

2. Selection and agreement are correlated. NP_i contains a verbal
morpheme V^0 that selects NP_{i+1}. V^0 undergoes obligatory head movement
to the inflectional element I^0, creating an inflected verb in the
head of IP. The φ-features will appear on the inflected verb by
specifier-head agreement, where they may be systematically correlated with the
verb's selectional properties in the lexicon.

3. Undergoes obligatory movement. V^0 selects and assigns a theta-role to
NP_{i+1}, but does not assign it case. Therefore NP_{i+1} must move. This
is possible if V^0 has lost its ability to assign case (passive morphology)
or if NP_{i+1} is the underlying subject of VP_i, as in currently popular
VP-internal subject analyses.

4. Transparent to extraction. In Barriers, blocking categories (BCs) stop
unbounded application of move-α. Informally, a BC is a category not
theta-marked by a lexical X^0. For example, matrix verb phrases are
BCs because they are selected by the nonlexical category I^0
(inflection) without being assigned a theta-role. Unbounded A-movement
becomes possible when a category is moved in local steps, adjoining to
intermediate nonargument positions before moving on (adjunction is
typically to BCs).

In our noun complement construction (figure 3.7), NP_{i+1} can be moved
out of NP_i. VP is a BC and a barrier for NP_{i+1} because it is not
L-marked, but NP_{i+1} can adjoin to the nonargument VP and void its
barrierhood because nonarguments may be freely adjoined to. Both
NP_i and IP_i are L-marked, and therefore are neither BCs nor barriers
for further NP_{i+1} raising. Thus, NP_{i+1} can be A-moved to any
c-commanding specifier-of-IP position without violating the ECP
because all traces are properly governed (both theta-governed by the
verb V that selects NP_{i+1}, and γ-marked (antecedent-governed) by the
deleted trace adjoined to VP).

Reinhart (pc) suggests a similar, albeit marginal, natural example
where an NP containing an argument trace is topicalized to CP
specifier from an L-marked position:

(7) a. ?[What burning t_{i+1}]_i did John say [of what book]_{i+1} [t_i would
be magnificent]

b. [What burning]_i did John say [[t_i of what book] would be
magnificent]

Chomsky (pc) suggests that the proper analysis of (7) is (7b), and that
a better topicalization example is (8).

(8) What burning did John say (that) of that book, Mary thought
would be magnificent.

5. Contains a landing site. The internal IP_i contains a specifier position
(landing site) that will agree with I^0 by specifier-head agreement in
nonlexical categories; the specifier position will also agree with N^0 (the
head of NP_i), by predication. Alternately, head movement from V^0 to
I^0 to N^0 can create an inflected noun "[[V I] N]" in the X^0 position
of NP_i that will agree with the landing site. Although it is difficult
to find a natural example of such an inflected noun, no arguments
or analyses exclude it in principle. A close natural example is noun
incorporation in Mohawk verbs (Baker 1985).

This establishes that the noun complement construction in figure 3.7 is a
stair in the Barriers model. □


Theorem 7 The LRP is NP-hard in the Barriers model.


Proof. The proof follows from lemma 3.3.1 and theorem 6 above. The
lexicon contains ambiguous inflected nouns "[[V I] N]" that have undergone
verbal incorporation. □

Ristad (1988:22-28) contains a direct proof of this theorem with all the
details explicitly worked out.

Theorem 8 The LRP is NP-hard in the Lasnik-Saito model.

Proof. The preceding proof proceeds without alteration in the Lasnik-Saito
(1984) model because in that model, theta-government suffices for proper
government, and traces may be deleted after γ-marking. □

How might we change the Barriers model in order to block the preceding 
reduction? 

The preceding proof relies on long movement of the NP complement of a
verb (in a noun complement construction), which is precisely what Barriers
strives to prevent by reducing proper government to antecedent government,
using the Lasnik-Saito γ-marking mechanism. (The commitment to eliminate
theta-government from the definition of proper government is tentative
at best. The strongest position taken is "Possibly, a verb does not properly
govern its θ-marked complement," p.79.) In the Barriers stair construction,
an argument undergoes long movement by adjoining the argument NP to
VP, γ-marking its trace, and then deleting the intermediate A'-trace at LF.

This is the exact derivational sequence (adjoin, γ-mark, delete adjoined
trace) used in Barriers (pp.21-22) to move a wh-subject from a theta-marked
CP complement to a specifier of CP, provided the wh-phrase is licensed at
LF. Barriers attempts to exclude similar long movement of an NP from a
similar (but caseless) subject position by appeal to Binding Theory condition
C at S-structure: the NP trace in subject position would be an A'-bound
R-expression A-bound in the domain of the head of its chain (p.93, fn.20).
(Barriers differentiates the two constructions solely by the nature of their
traces: wh-traces are not R-expressions, while NP-traces are.)

Crucially, condition C will exclude long movement of NPs only if trace
deletion is restricted to LF. Otherwise, adjoined traces could be deleted before
causing an S-structure binding violation. But trace deletion cannot be
restricted solely to LF. If it were, then any ECP violation created by
LF-movement may be avoided, simply by deleting offending intermediate traces
after they have done their γ-marking duty. This can be done because
adjoined A'-traces are not required by the extended projection principle at LF.
This is why neither Barriers nor Lasnik-Saito in fact restrict trace-deletion
to LF. Therefore, long movement of an NP is not excluded in these models.

There is another conundrum. The long movement used in the proof is
applied cyclically, so that the trace of the argument NP is no longer
c-commanded by the argument NP once all movement has applied, and hence
is not A-bound by the head of its chain at S-structure. This violates the
c-command condition on chain links, but such violations are standardly
ignored in the literature, and therefore do not raise any special problems
here. Structures where such violations are ignored include the
topicalization example (7) above, antecedent-contained ellipsis (9a) and passive
VP topicalization in English (9b).

(9) a. [Everyone that Max wants to e_2]_1 John will [kiss e_1]_2

b. [[VP Arrested t_i by the FBI]_j John_i has never been t_j]

Finally, even if trace deletion were disallowed entirely, long movement would
still be possible from theta-marked noun complements, and the proof of
lemma 3.3.1 would proceed, because theta-government cannot be eliminated
without negative consequences in the rest of the theory.

Proper government can be reduced to antecedent government only if
antecedent government suffices for NP-movement (e.g., passive and raising) in
accordance with the chain extension operation. This fails because only the
terminus of an (extended) A-chain may theta-mark or case-mark, in order
to obtain the CED effect (Condition on Extraction Domains, see Barriers,
p.72). Therefore, in passive constructions, where the A-chain headed by the
subject NP must be extended to include the verb and inflection and thereby
achieve antecedent government of the NP-trace at S-structure, the inflection
will simultaneously lose its ability to case-mark the subject position. The
direct consequence is that both passives in (10) violate the case filter and are
ungrammatical in Barriers without theta-government, although only (10a)
should be excluded.

(10) a. *[e] was killed John
b. John was killed t

In short, the chain extension required to satisfy the ECP without
theta-government will prevent the subject NP from receiving case, and thereby
violate the case filter. This open problem may be remedied by abandoning
either (i) the case filter, which would without question be disastrous for
the theory, (ii) the Barriers analysis of CED effects, which would reduce
empirical coverage, or (iii) the coindexing/chain extension analysis of
NP-movement, which will have the direct consequence that proper government
cannot be reduced to antecedent government.

The possibility of long distance argument movement by adjunction to
intermediate positions remains in more recent proposals based on the Barriers
model. One such proposal, due to Chomsky (1989), is that derivations be
subject to a "least effort principle," with the following provisos. LF
permits only the following five elements: arguments, adjuncts, lexical elements,
predicates, and operator-variable constructions. Affect-alpha must apply at
LF to each illegitimate object to yield one of these five legitimate elements.
Chomsky (1989:63) urges us to "consider successive-cyclic A-bar movement
from an argument position. This will yield a chain that is not a legitimate
object; it is a 'heterogeneous chain,' consisting of an adjunct chain and an
(A-bar, A) pair (an operator-variable construction, where the A-bar position
is occupied by a trace). This heterogeneous chain can become a legitimate
object, namely a genuine operator-variable construction, only by eliminating
intermediate A-bar traces. We conclude, then, that these must be deleted
at the point where we reach LF representation."

A direct consequence of this theory, then, is that successive-cyclic A-bar
movement from a theta-marked argument position to a case-marked argument
position will also yield an illegitimate object, that can become a
legitimate object, namely an A-chain, only by eliminating intermediate A-bar
traces at the point where we reach LF (that is, before LF chain conditions
apply). We conclude, then, that these intermediate traces must be deleted
at that point, and that long distance NP movement is permitted by the
theory.


3.4 Conclusions

3.4.1 Locality in linguistic theory

The guiding idea behind transformational theories is to map each linguistic
relation R(x, y) into a local phrase structure configuration at some level of
representation. When this is not possible, because x and y are not proximate
at any level of representation, then the relation R(x, y) must be broken into
a chain of local relations R(x, t_1), R(t_1, t_2), ..., R(t_n, y) using
intermediate elements t_i.

Locality in linguistic representations is a way to describe complex
interactions with intervening elements. When a nonlocal relation R(x, y) is broken
into a chain of local relations, then an element z that is on the path between
x and y can affect the nonlocal relation, by interacting with one of the
intermediate positions in the chain of local relations. For example, move-α is an
operation that induces a "movement" relation between the moved element
and its trace. This operation is constrained by subjacency and by the ECP.
As a consequence, unbounded chain dependencies can arise only from
successive cyclic movement, which constructs intermediate traces. If some step
of the movement is blocked, as when an intermediate landing site is filled or
there is a barrier to movement, then the nonlocal movement dependency is
blocked.
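
As a toy illustration of this decomposition, the Python sketch below
breaks a nonlocal relation into adjacent links and lets a blocked
intermediate position break the whole dependency. The function names
local_links and dependency_holds and the use of string labels for
positions are assumptions made for the example; nothing here models
subjacency or the ECP in any detail.

```python
from typing import List, Set, Tuple


def local_links(path: List[str]) -> List[Tuple[str, str]]:
    """Break R(path[0], path[-1]) into a chain of adjacent local links."""
    return list(zip(path, path[1:]))


def dependency_holds(path: List[str], blocked: Set[str]) -> bool:
    """Return True if every local link in the chain survives.

    A link fails when one of its ends is a blocked intermediate
    position (for example, a filled landing site). The endpoints of
    the whole path are never treated as intermediate positions.
    """
    blocked_sites = blocked & set(path[1:-1])
    return all(
        a not in blocked_sites and b not in blocked_sites
        for a, b in local_links(path)
    )
```

With the path x, t1, t2, y, blocking t2 breaks the dependency between x
and y even though no single link relates x and y directly.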

Locality, then, is only a constraint on the way relations are described. Any
nonlocal relation can always be described as a chain of local relations. In
fact, all recursively enumerable sets have local descriptions by definition,
because the idea of an "effective procedure" is one that consists entirely of
simple local operations, too elementary to be further decomposed (Turing,
1936; Minsky, 1969). In short, "locality" has no empirical consequences and
cannot be falsified.

Of course, a particular linguistic constraint, that happens to be stated in
terms of local configurations, can always be called a "locality constraint."
And particular constraints may in fact be falsifiable. However, no empirical
evidence can distinguish the way in which a constraint is stated, that is,
in terms of local or nonlocal configurations, and therefore "locality" is not
itself a source of constraint.

A case in point is the intermediate traces of Barriers-type theories, whose
only apparent purpose is to allow the iterated local description of nonlocal
movement relations. Intermediate traces result from a particular conception
of the ECP as a local bound on move-α. There is no direct empirical evidence
for the existence of intermediate traces. Nor is there indirect evidence,
simply because they do not interact with other components of the grammar.
For example, one might expect them to have binding or phonological effects.
Adjunct traces may satisfy the ECP only via antecedent government at LF; as a
consequence, adjunct extraction results in intermediate traces that may not
be deleted at SS. Thus, the only intermediate traces required at SS are
the traces of adjunct extraction, but these non-case-marked traces do not
block want+to → wanna contraction, which is only blocked by case-marked
elements (Chomsky). For example:

(11) How do you wanna solve the problem? 

As expected, the intermediate traces in specifier of CP and adjoined to VP
do not block phonological contraction. Neither do these intermediate
A'-traces affect binding relations, whose domain is NPs in A-positions:

(12) [which woman]_i did John [VP t_i dream [CP t_i [IP Bill [I^0 [VP t_i
[VP ... {*herself} ... t_i]]]]]]]

The governing category of the direct object is IP (the complete functional
complex), and therefore the c-commanding trace t_i adjoined to VP could
bind the anaphor in object position within its governing category, if the trace
were in an A-position. But, as expected, herself is in fact unbound, which
strongly suggests that t_i is only relevant to the computation of nonlocal
A'-movement as constrained by the ECP. The precise formulation of the ECP,
and the existence of the intermediate traces it requires, is the topic of much
active research and debate. But the fact that these intermediate traces do
not enter into other syntactic relations casts doubt on their explicit syntactic
representation, at least in my mind.

Finally, locality has no logical relation to explanatory adequacy. The
linguistic theory that best satisfies the locality requirement is generalized
phrase structure grammar. In GPSG, all linguistic relations are reduced to
maximally local relations between mother and daughter, or among sisters.
Relations may extend beyond immediate domination only by iteration, whether
from lexical head to its projections, or from gap to filler in an unbounded
dependency relation. Because all relations are uniformly represented in
syntactic categories, many formal devices may interact in constraining the
distribution of a given linguistic relation. This, when coupled with the iteration
of local relations to achieve nonlocal effects, can lead to severe computational
intractability: the universal recognition problem for GPSGs can take more
than exponential time (Ristad, 1986). More importantly, there are infinitely
many unnatural GPSG languages, including finite, regular, and context-free
languages. Thus, the linguistic theory that most closely satisfies the locality
requirement lacks both computational constraint and explanatory adequacy.

The mapping of linguistic relations onto local configurations must therefore
be justified in the same way that descriptions are, by criteria such as
elegance, perspicuity, and expressive power. Locality does not always result
in elegant linguistic representations; nor can all interactions be naturally
modeled in this manner.

Nonlocal relations are broken into a chain of local relations in order to
explain potential interactions with intervening elements. When there are no
actual interactions, the resulting representations are inelegant, containing
superfluous intermediate elements. A uniform bound on the domain of
relations, as in Koster (1987), will allow too many potential interactions that
won't occur. More seriously, as shown by the constructions in this chapter,
it is difficult to prevent undesirable interactions from occurring in such a
system of representation. (The alternative is a relativized bound on the
domain, that models all and only the actual interactions with intervening
elements. But this is identical in conception to a nonlocal relation,
antithetical to locality.)

Not all interactions can be naturally described in terms of local
configurations. For example, a linguistic relation that depends on elements outside
its domain cannot be modeled via local interaction. The transitive relations of
obviation, arising from the binding theory, have nonlocal effects on relations
of antecedence, and these interactions are not naturally modeled in terms of
local configurations.

3.4.2 The search for uniform mechanisms

The pursuit of general mechanisms for linguistic theory (such as feature
unification, the uniform local decomposition of linguistic relations, or the
coindexing mechanism of Barriers) has repeatedly proven treacherous in
the study of language. It distracts the attention and efforts of the field
from the particular details of human language, that is, what are the true
representations, constraints, and processes of human language.

General mechanisms have also invariably resulted in unnatural
intractability, that is, intractability due to the general mechanisms of the theory
rather than the particular structure of human language. This is because no one
mechanism has been able to model all the particular properties of human
language unless it is the unrestricted mechanism. However, the unrestricted
mechanism can also model unnatural properties, including computationally
complex ones.

In segmental phonology, rules are needed to insert, delete, and transpose
segments. Rules are also needed to perform arbitrary substitutions, as in the
case of irregular forms. Therefore, we conclude that phonological rewriting
rules must be completely unrestricted. However, this is a false conclusion,
because we have entirely ignored the possible interactions between rules. In
an unrestricted rewriting system, each rule applies to the derivation string
in a Markovian fashion, entirely oblivious to the previous rules that have
applied. But in a phonological system this is not the case. A phonological
rule cannot delete a segment that was inserted by another rule: inserted
segments are never rewritten and then deleted. Nor can arbitrary segmental
strings be arbitrarily rewritten: irregular forms may only be rewritten at
the interface between morphology and phonology, where all morphemes are
rewritten as segmental strings.

In current syntactic theories, many different types of agreement are used,
including specifier-head agreement, head-complement agreement (selection),
head-head agreement, head-projection agreement, and the various forms of
chain agreement (link, extension, composition). Therefore, we conclude, all
agreement may be subsumed under the most general agreement mechanism,
either feature unification (as in GPSG/LFG) or the coindexing operation (as
in Barriers). However, this conclusion invariably leads to unnatural analyses.
Specifier-head agreement includes the relations of morphological
specification and of predication, which is the saturation of an external theta-role.
Head-complement agreement includes the selection relation, which is sensitive
to an entirely different set of syntactic distinctions than morphological
specification is. So, for example, the agreement morpheme represents certain
distinctions (such as person, gender, or number) that selection is not
sensitive to, at least in English. English verbs do not select plural objects,
although they are morphologically marked for the plurality of their subjects.
The assignment of theta-roles is likewise insensitive to number, person, or
gender. When all these particular forms of agreement are subsumed under
one general mechanism, whether it be unification or coindexing,
unnatural forms of agreement invariably result from interactions. (The unification
mechanism, however, is considerably more general and powerful than the
coindexing mechanism of Barriers.) The complexity investigations in this
chapter have exploited this flaw in current transformational theories, by
simulating the unnatural stair construction.

In a way, these overgeneralizations reflect the mindset of formal language
theory, which is to crudely equate structural complexity with syntactic form.
By choosing the least general rule format that includes all the natural rules,
we need not allow any unnatural rules. However, as we have seen, we do
allow unnatural computations, because the resulting rule interactions are
almost surely unnatural.

The remedy is, we must adopt the mindset of computational complexity
theory, which is to equate structural complexity with computational resources.
In order to limit resources, we must limit the number of possible rule
interactions. [...] of linguistic constraints, that limit interactions among
linguistic processes.


Chapter 4 


Complexity of Anaphora 


This chapter defends the thesis that human language is NP-complete. In
order to defend such a thesis, we must defend both the upper bound, that
language comprehension is in NP, and the lower bound, that language
comprehension is NP-hard.

The chief obstacle that we face in defending either bound is the
incompleteness of our understanding of human language. Because our understanding
is incomplete, it would be premature to formalize the linguistic theory. It
would also be meaningless, because in order to be precise, a formal model of
human language would make statements that could not be justified
scientifically. Lacking a comprehensive formal model, it is not possible to prove the
upper bound. Nor can we prove the lower bound without a precise statement
of the language comprehension (LC) problem.

We overcome the impossibility of defining the complete LC problem as
follows. First we select a natural class of utterances, and use the scientific
methods of linguistics to determine what knowledge language users in fact
have about those utterances. Next, we construct the simplest theory of that
knowledge, under an appropriate idealization to unbounded utterances.
Finally, we pose the abstract problem of computing that knowledge for a given
utterance. This problem is a natural subproblem of language
comprehension, because in order to comprehend an utterance in that class, the language
user must compute that knowledge. Therefore, the complexity of such a
subproblem is a lower bound on the complexity of the complete LC problem,
by the principle of sufficient reason.


Although wc candot prtiwe the upper bound, we can. still accumulate empiri¬ 
cal evidence for it. One way to confirm a thesis is to CtihlttM its. predictions. 
The tippet bound makes tbs 1 following prediction; if an analysis of a linguis¬ 
tic phenomena leads to complexity outside A'T 5 , then the analysis is in error, 
and the phenomena has an empirically superior analysis whose complexi ty is 
inside A (V, Therefore, every time that we improve an analysis while ruduC‘ 
ing its complexity from outside r VP to inside _VT\ w* accumulate additionai 
empirical evidence for the upper bound . 1 

In this chapter, we illustrate both upper and lower bound techniques with
respect to the language user's knowledge of anaphora. Anaphora is the
process of interpreting anaphoric elements. An anaphoric or
referentially-dependent element is a word or morpheme that lacks intrinsic
reference, such as a pronoun or an anaphor. An anaphor is a reflexive, such
as is marked in English by the [+self] suffix, or a reciprocal, such as the
English each other. A pronoun is an element that does not have its own
reference, such as the English they, and is not an anaphor.

In order to completely understand an utterance, the language user must
determine the intended antecedent of every anaphoric element in the utterance;
otherwise, he has failed to understand the utterance. However, there is no
known satisfactory characterization of what is "the intended antecedent of
an anaphoric element." It cannot be any antecedent, because this results in
the trivial language miscomprehension problem. Such a problem statement
would allow the trivial (non)solution where every pronoun in the utterance
is assigned a distinct antecedent, none of which were previously mentioned
in the discourse.

In order to overcome this difficulty without making any unjustified or
unnecessarily strong assumptions, we require that solutions to the anaphora
problem introduce no new information. That is to say, the antecedent of
each anaphoric element must be drawn from the set of available antecedents,
in the current utterance and in previous utterances, produced earlier in the
discourse. The anaphora problem must also be defined in terms of structural
descriptions, and not expressions or utterances, in order to prevent the
trivial solution where the expression is assigned the null structural
description, and no anaphoric elements are understood to be present.

¹ One piece of evidence for this upper bound may be found in Ristad (1986),
which proves that the universal recognition problem (URP) for generalized
phrase structure grammars (GPSG) is EXP-POLY hard, and shows how to
construct an empirically superior "Revised GPSG" whose URP is NP-complete.
Another piece of evidence is the analysis of phonological theories in
chapter 2, where complexity is reduced from undecidable to inside NP. To my
knowledge, no other linguistic theory has been proved to have a complexity
outside of NP. The work of Peters and Ritchie (1973), who proved that a
formal model of their own design was undecidable, and Rounds (1975), who
proved that a restricted version of the Peters-Ritchie model was exponential
time, is not relevant because their formal model was not independently
proposed by linguists, or ever defended as a remotely plausible linguistic
theory. It amounts to little more than a very loose interpretation of the
statement "a grammar maps deep structures to surface structures by the
repeated application of transformational rules."

For these reasons the anaphora problem is: Given a structural description
S lacking only relations of referential dependency, and a set A of available
antecedents, decide if all the anaphoric elements in S can find their
antecedents in A. The set of available antecedents models the context in
which the utterance is produced.

The chapter is organized into three sections. Section 4.1 reviews the
language user's knowledge of referential dependencies, and proves in two
entirely different ways that the anaphora problem is NP-hard, thereby
establishing the NP-hardness of language comprehension as a whole and
demonstrating the power of a direct complexity analysis. Section 4.2 shows how
a widely-accepted analysis of the linguistic phenomenon of ellipsis leads to
a proof that the anaphora problem is PSPACE-hard. Next, I falsify this
analysis empirically, and sketch an improved analysis of ellipsis that reduces
the complexity of anaphora to NP. This illustrates the utility of the upper
bound. The conclusion evaluates an alternate approach to the mathematical
investigation of human language, based on the complexity analysis of
linguistic theories.


4.1 Two proofs of the NP lower bound

In this section, we use basic facts about the language user's knowledge of
referential dependencies to prove in two different ways that the anaphora
problem is NP-hard. What is it that language users know about referential
dependencies?

For one, language users know that an anaphoric element α may inherit the
reference of an argument β, as in Todd_1 hurt himself_1, where the anaphor
himself is understood as referring to the proper noun Todd; or Todd_1 said
Mary liked him_1, which could mean that 'Todd said Mary liked Todd'. The
judgement of coreference between α and its antecedent β is depicted here
by assigning α and β the same integral subscript. (Careful attention must
be paid to the intended interpretation of the anaphoric elements in the
examples below, as indicated by the subscripts.)

Language users also know that an anaphoric element must agree with its
antecedent in certain respects. Examples (13) are possible only if Chris is
masculine, whereas (14) are possible only if Chris is feminine.

(13) a. Chris_1 liked himself_1.
     b. Chris_1 thought Bill liked him_1.

(14) a. Chris_1 liked herself_1.
     b. Chris_1 thought Bill liked her_1.

This condition on agreement is transitive, as illustrated by the paradigm in
(15), where the student can be feminine (15a) or masculine (15b), but not
both simultaneously (15c).

(15) a. The student_1 prepared her_1 breakfast.
     b. The student_1 did his_1 homework.
     c. *The student_1 prepared her_1 breakfast after doing his_1 homework.

The asterisk '*' is used in (15c) to indicate the impossibility of the
depicted interpretation.

These facts of interpretation are widely described as the standard agreement
condition (SAC): all anaphoric elements that share an antecedent must be
nondistinct from it and from each other. Two elements are nondistinct if
and only if they do not disagree on any common feature (that is, they may
be unified).

Pronouns in different languages are marked for a wide range of distinctions
in person, gender, number, animacy, case, social class, kinship, reference,
antecedent noun class, grammatical function, thematic role, and so on (cf.
Wiesemann 1986; Sells 1987). It is true that every particular language
contains a fixed number of agreement features. However, linguistic theory
idealizes to an unbounded number of agreement features because those features
and the range of their possible values vary considerably from language to
language and do not seem to be restricted in principle (see appendix A.2).

The fact that all anaphoric elements that share an antecedent must agree
with it and with each other, in combination with the fact that anaphoric
elements must have antecedents, suffices to establish the following lemma.

Lemma 4.1.1 Anaphoric agreement can simulate graph k-coloring.

Proof. On input k colors and a graph G = (V, E) with vertices V =
{v_1, v_2, ..., v_n} and edges E, we construct a linguistic expression S
containing |V| pronouns and k available antecedents such that G is k-colorable
if and only if the pronouns in S can refer to the k available antecedents.
Available antecedents correspond to colors, pronouns in S correspond to
vertices in G, and disagreement between the pronouns in S corresponds to
edges in G.

To do this we need the n binary agreement features φ_1, φ_2, ..., φ_n, the
pronouns p_1, p_2, ..., p_n, and the available antecedents R_1, R_2, ..., R_k.
Each R_i is an argument, such as a noun phrase. Pronoun p_i represents vertex
v_i; for each edge (v_i, v_j) ∈ E attached to v_i, pronoun p_i has φ_i = 0
and pronoun p_j has φ_i = 1. It does not matter how the pronouns and
arguments are arranged in the expression S, provided that every argument is
a possible antecedent for each pronoun, and that no other linguistic
constraints interfere with the disagreement relations we are constructing.
It is always trivial to quickly construct such a sentence, as we did in
example (15).

In order to be understood, every pronoun must refer to one of the k available
antecedents. If there is an edge between two vertices in the input graph G,
then those two corresponding pronouns cannot share an antecedent in the
expression S without disagreeing on some agreement feature. Therefore
each permissible interpretation of S exactly corresponds to a k-coloring of
the input graph G. □

The reduction uses n binary agreement features, one for each vertex in the
graph. The feature system is used to represent subsets of the n vertices, and
therefore must be capable of making an exponential number of distinctions.
(In terms of the input length m = |G|, this feature system is capable of
making distinctions.)
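
The construction in the proof of Lemma 4.1.1 can be checked mechanically on small graphs. The following Python sketch is only an illustration under stated assumptions: feature bundles are modelled as dictionaries, the available antecedents are assumed to be unmarked for every φ_i (so they are nondistinct from every pronoun), and the function names are hypothetical. It builds the pronoun features exactly as described (φ_i = 0 on p_i, φ_i = 1 on each neighbouring p_j) and compares the resulting interpretability test against plain brute-force k-coloring.

```python
from itertools import product

def pronoun_features(vertices, edges):
    """Feature bundle of each pronoun p_i: phi_i = 0 on p_i, and phi_i = 1 on
    every pronoun p_j whose vertex v_j is adjacent to v_i."""
    feats = {v: {} for v in vertices}
    for i, j in edges:
        feats[i][i], feats[j][i] = 0, 1   # the edge as seen from v_i
        feats[j][j], feats[i][j] = 0, 1   # the same edge as seen from v_j
    return feats

def nondistinct(f, g):
    """True if f and g do not disagree on any feature they both specify."""
    return all(f[k] == g[k] for k in f.keys() & g.keys())

def interpretable(vertices, edges, k):
    """Can every pronoun pick one of k antecedents so that pronouns sharing
    an antecedent are pairwise nondistinct? (Brute force, exponential.)"""
    feats = pronoun_features(vertices, edges)
    for choice in product(range(k), repeat=len(vertices)):
        ref = dict(zip(vertices, choice))
        if all(nondistinct(feats[u], feats[v])
               for u in vertices for v in vertices
               if u != v and ref[u] == ref[v]):
            return True
    return False

def k_colorable(vertices, edges, k):
    """Reference check: plain brute-force graph k-coloring."""
    for choice in product(range(k), repeat=len(vertices)):
        color = dict(zip(vertices, choice))
        if all(color[u] != color[v] for u, v in edges):
            return True
    return False

G4 = [(1, 2), (2, 3), (3, 4), (4, 2)]
for k in (1, 2, 3):
    print(k, interpretable([1, 2, 3, 4], G4, k), k_colorable([1, 2, 3, 4], G4, k))
```

The two tests agree because two pronouns disagree on some feature exactly when their vertices are adjacent, so a set of pronouns may share an antecedent only if the corresponding vertices form an independent set.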




4.1.1 Agreement reconsidered

Agreement is always stated in terms of nondistinctness of features (cf. SAC).
The nondistinctness relation may be broken into three mutually-exclusive
and collectively-exhaustive subcases as follows. Two nonidentical elements
are nondistinct if and only if (i) they are orthogonal (have no common
feature, e.g., [+plu] and [+masc]); (ii) one subsumes the other (is strictly
more general than, e.g., [+plu] subsumes [+plu,+masc]); or (iii) they
partially overlap (have some but not all features in common, e.g.,
[-plu, person 1] and [-plu,+masc]).
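
The three subcases can be stated as a small decision procedure. The sketch below is illustrative only: it assumes feature bundles are nonempty Python dictionaries from feature names to values, and the function name `relation` and the returned labels are hypothetical.

```python
def relation(f, g):
    """Classify the relation between two nonempty feature bundles."""
    shared = f.keys() & g.keys()
    if any(f[k] != g[k] for k in shared):
        return "distinct"            # they disagree on a common feature
    if f == g:
        return "identical"
    if not shared:
        return "orthogonal"          # case (i): no common feature
    if shared == f.keys() or shared == g.keys():
        return "subsumption"         # case (ii): one bundle contains the other
    return "partial overlap"         # case (iii): some but not all in common

print(relation({"plu": "+"}, {"masc": "+"}))
print(relation({"plu": "+"}, {"plu": "+", "masc": "+"}))
print(relation({"plu": "-", "person": 1}, {"plu": "-", "masc": "+"}))
print(relation({"plu": "+"}, {"plu": "-"}))
```

Every pair of nonidentical, nondistinct bundles falls into exactly one of the last three branches, which is the sense in which the subcases are mutually exclusive and collectively exhaustive.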

The standard agreement condition makes two broad empirical predictions.
However, neither appears to be true.

The first prediction is that there are languages with nonidentical nondistinct
pronouns. Otherwise, the SAC cannot be falsified, and should be abandoned
because it is unnecessarily powerful. The strongest confirmation of the SAC,
then, would be to find languages for each of the three subcases of
nondistinctness: languages with orthogonal pronouns, languages with pronouns
that subsume each other, and languages with pronouns that partially overlap.
Such a language would have a pronoun system that could not be written
down in the standard textbook format (a chart that partitions the space of
possible feature combinations).

Although I am out of my depth at this point, I observe that English is not
such a language, nor could I find such a language among those spoken in
Europe, or in the forty-nine non-European languages discussed in Wiesemann
(1986). The English system of personal pronouns exactly partitions
the space of possible feature combinations, as shown in the following table
(suppressing case marking):


                    singular    plural

    1st                I          we
    2nd                   you
    3rd    masc        he
           fem         she       they
           neut        it



(The pronoun one is not a counterexample, because it does not have a
linguistic antecedent. Rather, it has an arbitrary interpretation and may
only "accidentally" corefer.)















The second prediction is that nonidentical nondistinct pronouns can share
an antecedent. Given the difficulty of obtaining truly nondistinct pronouns,
this prediction is not easily tested. However, we may imagine the following
scenario. You are talking to a new friend about his recent health checkup,
and, not wishing to make any assumptions about the gender of the examining
doctor, you use the third person plural pronoun (16a) instead of either third
person singular pronoun (16b).


(16) a. About your doctor_1, did they_1 seem competent?
     b. About your doctor_1, did he_1/she_1 examine you thoroughly?


In this case, they is used as a pure third person pronoun, unmarked for
number or gender, and hence subsumes both he and she. However, it is still
not possible to use they in combination with he or she, contra the standard
agreement condition:

(17) *About your doctor_1, did they_1 seem competent when he_1/she_1
     examined you?

These and related facts suggest that the agreement condition must be
strengthened, to state that every argument in a sentence must have a unique
anaphoric root morpheme in each of the language's anaphoric systems. That is,
we assume that the systems of personal, relative, demonstrative, and
interrogative pronouns are all independent. This version of the agreement
condition, which partitions the set of anaphoric elements into equivalence
classes, is the anaphoric equivalence condition.²

The anaphoric equivalence condition helps to explain the uncertain nature
of obviation violations between pronouns with distinct anaphoric roots. The
split antecedence examples in (18) demonstrate that, unlike obviation
violations between pronouns that share the same anaphoric root morpheme,
obviation is not always enforced between pronouns with different anaphoric
root morphemes.

(18) a. *John_1 suggested to Bill_2 that [he_1 shoot them_{1,2}].
     b. Bill_1 reminded Sue_2 that [he_1 introduced them_{1,2} to the pope].

Rather, lexical factors, such as the choice of verb, play a significant role.

(19) a. Navarre_1 suggested to Benedict_2 that [he_1 persuade them_{1,2}
        [PRO_{1,2} to perjure themselves_{1,2}]].
     b. *Navarre_1 suggested to Benedict_2 that [he_1 tell them_{1,2}
        [PRO_{1,2} to perjure themselves_{1,2}]].

² This condition may underlyingly be a fact about language acquisition. It
seems that the language acquisition device first chooses a set of relevant
semantic, syntactic and phonological distinctions, then partitions the
resulting feature space into natural classes, and finally assigns a
phonologically distinctive anaphoric root morpheme to each class (cf. Chiat
1986). One consequence of such an acquisition procedure would be the
anaphoric equivalence condition. Alternately, the anaphoric equivalence
condition is a linguistic process that simplifies the relation between sound
and meaning: once a pronoun is syntactically linked to a given antecedent,
then the sound of that pronoun comes to stand for that antecedent within the
scope of the utterance.

4.1.2 Relations of referential dependence

We have seen that anaphoric elements must have antecedents, subject to
an agreement condition, perhaps the anaphoric equivalence condition. A
second component of linguistic knowledge is that pronouns must be
disjoint in reference from certain arguments. For example, every English
speaker knows that Todd_1 hurt him_*1 cannot mean that 'Todd hurt Todd'.

This judgement of disjoint reference, that α cannot refer to β, is depicted
here by assigning α the integral subscript of β preceded by an asterisk.

Knowledge of coreference and disjoint reference must be represented in the
brain of the language user, and by every scientifically adequate linguistic
theory. The simplest, and least controversial, representation is to postulate
two abstract linguistic relations: an asymmetric link(α, β) relation that
holds between an anaphoric element α and its immediate antecedent β, subject
to the agreement condition, and a symmetric obviate(α, β) relation that holds
between two arguments α and β that cannot share any referential values
(Higginbotham, 1985).

Every structural description, then, includes a directed graph of link
relations, as well as an undirected graph of obviate relations whose vertices
are the arguments in the structural description and whose undirected edges
represent the obligatory nonoverlapping reference of two arguments. Let this
graph of referential dependencies, which consists of the obviate and link
relations of a given structural description, be called the RDG.
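
To make the anaphora problem concrete, an RDG can be represented by two edge lists and searched exhaustively. This Python sketch is a deliberate simplification and not the analysis of the text: it assumes every argument has a single referential value (so split antecedents are ignored), that each available antecedent refers to itself, and that link(α, β) forces equal reference while obviate(α, β) forces distinct reference. The function name `solve` is hypothetical.

```python
from itertools import product

def solve(anaphoric, antecedents, links, obviates):
    """Return one assignment of antecedents to anaphoric elements that
    respects all link and obviate relations, or None if there is none."""
    for choice in product(antecedents, repeat=len(anaphoric)):
        ref = {a: a for a in antecedents}      # antecedents denote themselves
        ref.update(zip(anaphoric, choice))     # candidate referents
        if (all(ref[x] == ref[y] for x, y in links)
                and all(ref[x] != ref[y] for x, y in obviates)):
            return {e: ref[e] for e in anaphoric}
    return None

# "John said that [he liked him]": he and him are co-arguments, so they obviate.
elements = ["he", "him"]
obviates = [("he", "him")]
print(solve(elements, ["John"], [], obviates))          # no reading with John alone
print(solve(elements, ["John", "Tom"], [], obviates))   # needs a second antecedent
```

The search takes time exponential in the number of anaphoric elements, which is consistent with the hardness results of this section; checking a proposed assignment, by contrast, takes time linear in the size of the RDG.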


Our goal is to prove the NP-hardness of the anaphora problem without using
the agreement condition. The idea of the proof will be to construct RDGs
that simulate some NP-complete problem. Let us therefore examine a range
of syntactic configurations, in order to better understand the distribution of
link and obviate relations in linguistic representations.

The first, most important, syntactic configuration is "local c-command": an
anaphor must link to some locally c-commanding β, and a pronoun must
obviate all such β.³ To a first approximation, two elements are local if they
are co-arguments, that is, arguments of the same verb or noun. (The exact
definition of locality does not matter for our purposes; all that matters here
is the fact that antecedence and disjoint reference are possible or necessary
in some configurations, and not in others.) We say β c-commands α in a
phrase structure tree if and only if all branching nodes that dominate β in
the tree also dominate α. In particular, the direct object c-commands the
indirect object, and the subject of a clause c-commands both direct and
indirect objects.
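
The definition of c-command translates directly into a walk up the tree. The sketch below is only an illustration: the `Node` class, the flat ternary VP, and the example tree for John introduced Bill to him are assumptions made for the example, and the test is meant to be applied to two distinct arguments neither of which dominates the other.

```python
class Node:
    """A phrase structure node that remembers its parent."""
    def __init__(self, label, *children):
        self.label = label
        self.children = list(children)
        self.parent = None
        for child in self.children:
            child.parent = self

def ancestors(node):
    """Yield every node that properly dominates `node`, bottom-up."""
    node = node.parent
    while node is not None:
        yield node
        node = node.parent

def c_commands(beta, alpha):
    """beta c-commands alpha iff every branching node dominating beta
    also dominates alpha."""
    above_alpha = {id(n) for n in ancestors(alpha)}
    return beta is not alpha and all(
        id(n) in above_alpha for n in ancestors(beta) if len(n.children) > 1)

# [S John [VP introduced Bill [PP to him]]]
john, bill, him = Node("John"), Node("Bill"), Node("him")
pp = Node("PP", Node("to"), him)
vp = Node("VP", Node("introduced"), bill, pp)
s = Node("S", john, vp)

print(c_commands(john, bill), c_commands(john, him))   # subject over both objects
print(c_commands(bill, him), c_commands(him, bill))    # direct over indirect only
```

In this tree the subject c-commands both objects and the direct object c-commands the indirect object, but not conversely, matching the two cases singled out in the text.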

An anaphor must be linked to a unique c-commanding argument, and this
argument must be local. This is illustrated by the paradigm in (20), where
the domain of locality is indicated by square brackets.

(20) a. [John_1 shot himself_1]
     b. [John_1 introduced Bill_2 to himself_{1/2}]
     c. *John_1 thought Bill_2 said [Mary liked himself].

Example (20b) shows that an anaphor can take any c-commanding antecedent
inside the local domain; example (20c) shows that an anaphor
must have some antecedent inside the local domain. Mary is not a possible
antecedent for himself in (20c) because they disagree on gender.

Pronouns are locally obviative: a pronoun cannot share referential values
with any argument that c-commands it in its local domain. (The domain
of locality is roughly the same as for anaphors; again, all that is needed for
the proofs below is that there exist configurations that result in obviation.)
This is illustrated by the paradigm in (21).

³ The requirement that anaphors have local antecedents is called "condition
A" and the requirement that pronouns be locally obviative is called
"condition B" in the linguistics literature.





(21) a. [John_1 shot him_*1]
     b. [John_1 introduced Bill_2 to him_{*1/*2}]
     c. John_1 thought Bill_2 said [Mary liked him_{1/2}]

Example (21b) shows that a pronoun is disjoint from all locally c-commanding
arguments; example (21c) shows that a pronoun can link to any argument
outside its local domain.

Like the agreement condition, the prohibition against sharing referential
values is a transitive condition that is enforced globally, as shown by the
paradigm in (22).

(22) a. John_1 said that [Bill_2 liked him_{1/*2}].
     b. John_1 said that [he_{1/*2} liked Bill_2].
     c. *John_1 said that [he_1 liked him_1].

Him can refer to John in (22a); he can refer to John in (22b); but he and
him cannot both refer to John in (22c), because he locally c-commands him
and hence they are obviative.

Obviation applies equally to all linguistic coreference, including the intra-
and extra-sentential linking of pronouns, because obviation cannot be
violated, even if a pronoun and its antecedent are in different sentences.
Without loss of generality then, all linkings in this chapter will be
intrasentential for expository convenience.

The other local c-command configuration is "exceptional case-marking"⁴ (ECM),
where the subject of an ECM verb locally c-commands the subject of its
infinitival complement. This is illustrated by the paradigm in (23), with the
ECM verbs want and expect.


(23) a. John_1 wants [himself_1 to shoot Bill]
     b. John_1 expects [him_*1 to shoot Bill]


Examples (23) demonstrate that the subject John of an ECM verb locally
c-commands the subject α of the infinitival complement [α to shoot Bill]
for both anaphors and pronouns.

⁴ These verbs are called "exceptional case-marking" because, unlike other
verbs that take a clausal complement, ECM verbs take an infinitival clausal
complement and assign abstract Case to its subject. Other ECM verbs in
English include believe, prefer, like, and related verbs.



Our goal is to prove that the anaphora problem is NP-hard, without using
any agreement features. Let us therefore pause to consider how such a proof
might work.

Imagine that we must color the following four-vertex graph G_4 with three
colors:

    {(1,2), (2,3), (3,4), (4,2)}

Then our reduction might construct a sentence containing three available
antecedents and four pronouns. The first part of the sentence, Before Bill_a,
Tom_b, and Jack_c were friends ..., would represent the three colors, where
each proper noun corresponds to a different color. The second part of the
sentence would have an obviation graph equivalent to G_4, where the pronoun
p_i in the sentence corresponds to vertex i in G_4. As expected, it is
difficult to understand the resulting sentence:

(24) Before Bill_a, Tom_b, and Jack_c were friends,
     [he_1 wanted him_2 to introduce him_3 to him_4].

The corresponding obviation graph appears in (25).

(25) [Figure: obviation graph on the vertices 1, 2, 3, and 4, with the edge
     1-2 labeled "ecm" and the edges 2-3, 3-4, and 4-2 labeled "coarg".]

Each vertex is labeled with the numerical index of its corresponding pronoun,
and each edge is labeled with the syntactic configuration responsible for the
corresponding obviate relation. (Recall "coarg" means "argument of same
verb or noun," and "ecm" means "exceptional case marking" configuration.)

By carefully grounding the reference of each pronoun in turn, we can confirm
that the obviation graph for (24) exactly corresponds to the four-vertex
graph G_4. Let he_1 link to Bill_a in the sentence; this corresponds to
coloring vertex 1 in G_4 with the color a. Then in the simplified sentence
[Bill wanted him_2 to introduce him_3 to him_4] we can clearly see that Bill
can be the antecedent of any pronoun but him_2; this corresponds to G_4,
where coloring vertex 1 with a given color only prevents us from coloring
vertex 2 the same color. Continuing in this fashion, one can convince oneself
that the pronouns in such a sentence can find their antecedents in the
sentence iff the corresponding graph G_4 is 3-colorable.
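
The grounding argument for (24) can also be carried out by enumeration. The following Python sketch is an illustration under the simplifying assumption that each pronoun takes exactly one of the three proper nouns as its antecedent; the constant and function names are hypothetical. It lists every assignment that respects the four obviate relations of the sentence, which are exactly the proper 3-colorings of G_4.

```python
from itertools import product

# Obviate relations of (24), labelled by the configuration responsible.
OBVIATES = {(1, 2): "ecm", (2, 3): "coarg", (3, 4): "coarg", (4, 2): "coarg"}
ANTECEDENTS = ["Bill", "Tom", "Jack"]          # the three "colors"

def readings():
    """All antecedent assignments for he_1, him_2, him_3, him_4 in which no
    two obviating pronouns share an antecedent."""
    result = []
    for choice in product(ANTECEDENTS, repeat=4):
        ref = dict(zip((1, 2, 3, 4), choice))
        if all(ref[i] != ref[j] for i, j in OBVIATES):
            result.append(ref)
    return result

all_readings = readings()
print(len(all_readings))                       # number of 3-colorings of G_4

# Ground he_1 = Bill: only him_2 is barred from also referring to Bill.
with_bill = [r for r in all_readings if r[1] == "Bill"]
print(sorted({r[2] for r in with_bill}))
print(any(r[3] == "Bill" for r in with_bill), any(r[4] == "Bill" for r in with_bill))
```

Because him_2, him_3, and him_4 are mutually obviative, they must take three different antecedents, and he_1 may take either antecedent not chosen by him_2.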

The local c-command configurations used in (24) only give rise to simple
obviation graphs, and therefore the second proof of NP-hardness will employ
three additional syntactic configurations: adjunct control, strong crossover,
and invisible obviation.

In the expression Sue screamed before jumping, all English speakers know
that Sue is the understood subject of the gerund jumping, that is, everyone
knows that Sue did the jumping. In order to represent this linguistic
knowledge, as we must, we may postulate a silent pronoun 'PRO' in the
subject position of the adjunct [before jumping], and obligatorily link PRO
to Sue.

(26) Sue_1 screamed [before PRO_1 jumping]

This is called "subject control" because the reference of PRO is controlled by
the subject of the main clause. Another example of subject control appears
in (27a) with the verb promise. Contrast this to the verb persuade in (27b),
which is an object control verb.

(27) a. Tom_1 promised Mary_2 [PRO_{1/*2} to attend school]
     b. Tom_1 persuaded Mary_2 [PRO_{*1/2} to attend school]


Further evidence for the existence of this silent pronoun comes from its
interaction with overt anaphora. Observe that himself must refer to Mark in
(28a), and him must be disjoint from Mark in (28b).

(28) a. Mark_1 vomited [after PRO_1 getting himself_1 plastered]
     b. Mark_1 vomited [after PRO_1 getting him_*1 plastered]

Without PRO, such facts are a complete mystery. But once the understood
subject of the gerund is explicitly represented using PRO, as we have done
in (28), the facts are trivially accounted for as canonical configurations of
local c-command between silent PRO and an overt anaphoric element. (In
any event, the complexity reduction proceeds whether PRO exists or not; all
that matters for the reduction is the empirical fact that himself must refer
to Mark and him must be disjoint from Mark in expressions like (28).)

"Wh-movement" is the configuration that holds between a wh-phrase, such
as [who] or [which person], that appears displaced from its underlying
argument position, and that position. For example, in Who_k did Mary see t_k,
the underlying position of the wh-phrase who_k as the direct object of the
verb see has been marked with a trace t_k, coindexed with it. This represents
the fact that who_k stands in the same relation to the verb see as it does in
the related expression Mary saw who.

"Strong crossover" occurs when an anaphoric element α intervenes between a
wh-phrase and its trace, and c-commands the trace. In such a configuration,
α obviates the subject of the wh-phrase. This is shown in (29a), where the
pronoun he c-commands the trace t_k of the wh-phrase [which person], and
for this reason must be understood as disjoint from the person who kissed
Mary.

(29) a. [Which person]_k did he_*k say t_k kissed Mary.
     b. [Which person]_k t_k said he_k kissed Mary.

In (29b), however, there is no strong crossover configuration, and no
obviation. That is, (29b) has an interpretation of the form, 'for which person
x, did x say x kissed Mary.' Such facts are difficult to explain without
an explicit trace, because the wh-phrase [which person] and the pronoun he
stand in the same relation in both sentences.

In the expression The man who_k Mary saw t_k, we say that who_k heads the
relative clause [who_k Mary saw t_k], and that it predicates its subject,
[the man]. When the relative clause contains a pronoun in a strong crossover
configuration, then the pronoun obviates the subject of the relative clause,
as in (30).

(30) [the man]_k [who_k he_*k likes t_k].

"Ellipsis" is the syntactic phenomenon where a phrase is understood but
not expressed in words, as in The men ate dinner and the women did too,
which can only be understood to mean that 'the women did eat dinner too'.
For this example, we would say that the verb phrase [eat dinner] has been
ellipsed in the second conjunct; this is called VP-ellipsis.

A configuration of "invisible obviation" arises between the subject of an
ellipsed verb phrase and the direct and indirect objects of the overt verb
phrase, because both subjects in effect locally c-command the other
arguments of the verb. Observe that him can refer to Mark in (31a), but not
in (31b).

(31) a. Mark_1 wanted Jesse to love him_1
     b. *Mark_1 wanted Jesse to [love him_1] before PRO_1 allowing
        himself_1 to [∅]


In this example, the anaphor himself is obligatorily linked to PRO, and PRO
to the matrix subject Mark. The pronoun him invisibly obviates himself,
the subject of the ellipsed VP, because they are in an invisible configuration
of local c-command. The RDG for (31b) appears in (32):

(32) [Figure: RDG for (31b), with vertices including PRO, him, and himself,
     and an edge labelled "control".]


Single lines depict relations of obviation, while double lines depict relations
of coreference. Vertices are labelled with the corresponding noun phrase
arguments, and the edges are labelled with the configurations to which they
are attributable.

It must again be emphasized that the complexity classifications of this
section in no way depend on the existence of traces, PRO, or on any other
details of the linguistic analysis. The reduction relies only on the empirical
facts of obligatory disjoint reference in configurations such as (23b), (29a),
(30), and (31). The linguistic analysis is included for pedagogy, and to
organize the presentation.









This concludes our survey of the language user's knowledge of referential
dependence, which has been studied extensively. The next step is a direct
complexity proof.


4.1.3 From satisfiability to referential dependence

The conceptually natural reduction is from graph coloring to the anaphora
problem. However, the transformation of arbitrary graphs into linguistic
representations is cumbersome. To overcome this difficulty, we might reduce
from 3SAT, by way of the graph 3-coloring problem. That is, on input a
Boolean formula f in 3-CNF, we would first use the classic reduction of
Lawler (1976) to construct a corresponding instance g of the graph 3-coloring
problem. Next, from g we would construct an instance (s, a) of the anaphora
problem, such that the anaphoric elements in s can find their antecedents
in a iff g is 3-colorable (and f is satisfiable). By restricting our attention
to this class of "3SAT graph colorings," we would simplify the reduction into
the task of transforming a simple class of "difficult" graphs into linguistic
representations. Of course, the intermediate "graph 3-coloring" stage of the
reduction won't really be used in the proof.
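
For readers unfamiliar with the intermediate stage, the following Python sketch shows one standard textbook reduction from 3SAT to graph 3-coloring. It is an assumption that this gadget matches the one the text has in mind; it may differ in detail from Lawler's construction. The three palette vertices are named "T", "F", and "N" here to echo the True, False, and Neutral antecedents used later, clauses are written as triples of signed integers (a hypothetical encoding in which -2 stands for the negation of x_2), and all function names are illustrative.

```python
from itertools import product

def three_sat_to_coloring(clauses):
    """Edge set of a graph that is 3-colorable iff the 3-CNF is satisfiable."""
    edges = {("T", "F"), ("F", "N"), ("T", "N")}          # palette triangle
    lit = lambda l: ("x", l)                              # vertex for a literal
    for v in {abs(l) for c in clauses for l in c}:
        # x and its negation differ from each other and from N
        edges |= {(lit(v), lit(-v)), (lit(v), "N"), (lit(-v), "N")}

    def or_gadget(a, b, tag):
        # The output w is forced to F's color whenever a and b both have it.
        u, v, w = (tag, "u"), (tag, "v"), (tag, "w")
        edges.update({(a, u), (b, v), (u, v), (u, w), (v, w)})
        return w

    for i, (a, b, c) in enumerate(clauses):
        w = or_gadget(lit(a), lit(b), (i, 1))
        z = or_gadget(w, lit(c), (i, 2))
        edges |= {(z, "F"), (z, "N")}                     # clause output must be T
    return edges

def three_colorable(edges):
    """Simple backtracking 3-coloring test."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    order, color = list(adj), {}
    def extend(k):
        if k == len(order):
            return True
        node = order[k]
        for c in range(3):
            if all(color.get(nb) != c for nb in adj[node]):
                color[node] = c
                if extend(k + 1):
                    return True
                del color[node]
        return False
    return extend(0)

def satisfiable(clauses):
    """Brute-force satisfiability, used only to cross-check the reduction."""
    vs = sorted({abs(l) for c in clauses for l in c})
    return any(all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses)
               for val in (dict(zip(vs, bits))
                           for bits in product([False, True], repeat=len(vs))))

for f in ([(1, 2, 3)], [(1, -2, 3), (-1, 2, -3)], [(1, 1, 1), (-1, -1, -1)]):
    print(f, satisfiable(f), three_colorable(three_sat_to_coloring(f)))
```

The graphs produced this way consist of small repeated gadgets, which is the property that makes them easier to realize as linguistic representations than arbitrary graphs.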

Lemma 4.1.3 Referential dependencies can simulate 3SAT.

Proof. On input a Boolean formula f consisting of the clauses C_1, C_2, ...,
C_p in the variables x_1, x_2, ..., x_n, we construct a linguistic
representation s and a set of available antecedents a such that the anaphoric
elements in s can find their antecedents in a iff f is satisfiable.

The set a will contain exactly three distinct antecedents: True, False, and
Neutral. These noun phrases represent, respectively, the three possible truth
values 'true', 'false', and 'unassigned.'

For every variable x_j in f, s will contain two pronouns, one to represent x_j
and the other to represent its negation x̄_j. In order to preserve the
semantics of negation, these pronouns must obviate each other. Both will also
obviate the proper noun Neutral, and therefore can only link to True or False.
To be precise, for every Boolean variable we build an object control
construction (33) that contains two possible targets for ellipsis, VP_1 and
VP_2.

(33) [Figure: phrase structure tree of the object control construction,
     headed by the verb persuade, with Neutral as an argument of the lower
     clause.]


This is the phrase structure that would be assigned to expressions such as
He persuaded him to introduce him to Neutral.

In this construction (33), configurations of local c-command in the lower
clause cause mutual obviation among PRO, the pronoun for x̄_j, and Neutral.
The object control verb persuade obligatorily links PRO to the pronoun for
x_j, as depicted by the arrow. The subject position is filled with a dummy
pronoun so as not to increase the number of available antecedents in the
construction. The resulting RDG ensures that the pronoun for x_j must
refer to either True or False; that the pronoun for x̄_j must refer to the
other antecedent; and that neither pronoun can refer to Neutral. In short, the
construction (33) correctly ensures consistency of truth assignments, as well
as correctly representing the semantics of Boolean negation.
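
The behaviour of the variable construction can be confirmed by enumerating its readings. The Python sketch below is an illustration, not part of the proof. It assumes the RDG described in the preceding paragraph for He persuaded him to introduce him to Neutral: PRO is linked to the pronoun for x_j, and PRO, the pronoun for the negation, and Neutral are mutually obviative. The obviation between the dummy subject and the object of persuade is my own added assumption, based on their being co-arguments. All names are hypothetical.

```python
from itertools import product

ANTECEDENTS = ["True", "False", "Neutral"]
PRONOUNS = ["he", "x", "PRO", "not_x"]     # dummy subject, x_j, PRO, negation of x_j
LINKS = [("PRO", "x")]                     # object control by persuade
OBVIATES = [("he", "x"),                   # assumed: co-arguments of persuade
            ("PRO", "not_x"),              # co-arguments in the lower clause
            ("PRO", "Neutral"),
            ("not_x", "Neutral")]

def readings():
    """Yield every assignment of antecedents that respects the RDG."""
    for choice in product(ANTECEDENTS, repeat=len(PRONOUNS)):
        ref = {a: a for a in ANTECEDENTS}  # antecedents denote themselves
        ref.update(zip(PRONOUNS, choice))
        if (all(ref[a] == ref[b] for a, b in LINKS)
                and all(ref[a] != ref[b] for a, b in OBVIATES)):
            yield ref

truth_values = {(r["x"], r["not_x"]) for r in readings()}
print(sorted(truth_values))
```

Under these assumptions the only surviving readings pair True with False in one order or the other, so the gadget behaves as a consistent Boolean variable together with its negation.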

Both pronouns are in the direct object position of a separate verb phrase, and
hence possible targets for VP-ellipsis and the resulting invisible obviation.
We will take advantage of this below, by representing positive literals of x_j
as ellipsis of the higher VP_1, and negative literals of x_j as ellipsis of the
lower VP_2.

For each clause C_i = (a_i ∨ b_i ∨ c_i), we construct the rather intricate
syntactic structure shown in figure 4.1, whose graph of referential
dependencies appears in figure 4.2. The effect of this obviation graph in
combination with the limited set of available antecedents is to ensure that
one of the three ellipsed verb phrases in figure 4.1 must contain a pronoun
linked to the antecedent True. This corresponds exactly to the requirement
that each clause contain a true literal, concluding our simulation. □

It is unit to be expected that the linguistic expressions constructed by the 
preceding reduction arc easy to comprehend, any more than we expect to 
actually build the physical devices used to prove lower bounds on the com- 




Figure 4.1: This phrase structure simulates the ith Boolean clause C_i = (a_i ∨
b_i ∨ c_i), with irrelevant details suppressed. Dashed arrows depict the predication
of a noun phrase by an extraposed relative clause. "S" is a clause, "NP" is a
noun phrase, "VP" is a verb phrase, and "PP" is a prepositional phrase. All NPs
dominate pronouns. The structure contains configurations of local c-command,
strong crossover, adjunct control, and invisible obviation, yielding the graph of
referential dependencies shown in figure 4.2. The targets of the ellipsed VPs
are the VPs in the literal constructions (33), depending on the polarity of the
corresponding literal. Consequently, three of the NPs invisibly obviate the
pronouns representing a_i, b_i, and c_i, respectively. This is the phrase structure
that would be assigned to expressions such as [example sentence illegible in the
source]. (Indices, traces of wh-movement, and elliptical VPs are included here
solely to help the reader align the expression with its phrase structure.)

Figure 4.2: This is the graph of referential dependencies for the phrase structure
of figure 4.1. Single lines depict obviation relations; double lines depict relations of
coreference, that is, links and predications. Vertices in the graph correspond to noun
phrase arguments, and are labelled with the identifying indices from figure 4.1. The
edges in the graph are likewise labelled with the configurations to which they are
attributable. Recall that "euarg" and "cim" are configurations of local c-command;
"wJiS" is for a strong crossover involving a wh-phrase and "ellipsis" for the invisible
obviation arising from the ellipsis of a verb phrase representing the relevant Boolean
literal from (33). The obviation graph that results when coreferential vertices are
coalesced, in combination with the three available antecedents, ensures that at least
one of the pronouns representing a_i, b_i, or c_i must link to the proper noun True. This
corresponds exactly to the constraint that a true Boolean 3-clause contain at least
one true literal.


















plexity of problems in robot motion planning (Reif 1979; Reif and Sharir
1985; Canny 1988). Certainly, it is not possible to build physical devices of
such intricacy, any more than it is possible for a language user to comprehend
the utterances we have just constructed. Yet the practical questions of
what physical devices can and cannot be built, or of what linguistic expressions
can and cannot be easily understood, do not concern us here. We want
to understand the theoretical structure of abstract computational problems,
and use complexity analysis to better reveal this structure.

All that remains is to state the main theorem of this section.

Theorem 9 The anaphora problem is NP-hard.

Proof. By lemma 4.1.1, or by lemma 4.1.2. □

We now have two complexity lower bounds for human language comprehension
that rely only on the empirical facts of referential dependency. It does
not matter exactly how the conditions on coreference and disjoint reference
are stated, only that there are such conditions, as there are in all known
human languages. For this reason, we may believe with confidence that the
NP-hardness result applies to all adequate linguistic theories. Moreover, the
directness of the reductions suggests that the anaphora problem is one of
the more difficult subproblems of language comprehension, because graph
k-coloring is one of the most difficult NP-complete problems, a trap-door
function with no known approximation scheme, and known to be average-
case NP-complete.

4.2 Evidence for an NP upper bound

Lacking a complete, scientifically plausible linguistic theory, it is not possible
to prove an upper bound on the complexity of human language. It is
however possible to provide empirical evidence for an upper bound, and that
is the task of this section. The argument goes as follows. First, we examine
the linguistic phenomenon of ellipsis, and present empirical arguments
for a simple theory of the knowledge that language users have about such
phenomena. Next, we use this theory to prove that the anaphora problem
is PSPACE-hard. Using the insights of the complexity proof, we reexamine
the phenomenon of ellipsis, falsify the copy-and-link theory, and suggest an
empirically superior predicate-sharing theory. Finally, we prove that the
anaphora problem is in NP according to the predicate-sharing theory. By
reducing the complexity of ellipsis to inside NP, while strictly improving
the empirical adequacy of the theory of ellipsis, we provide evidence for an
NP upper bound.

In developing our simple linguistic theories, we will briefly introduce the
relevant phenomenon, state the theory, and conclude with a concise
enumeration of the empirical arguments in support of the theory.

4.2.1 Simple theory of ellipsis

A central goal of linguistics is to explicitly represent the knowledge that
language users have about utterances. Let us, purely as a matter of
convenience, distinguish the representation of how an utterance is expressed in
words and phrases from a representation of the logical aspects of its meaning,
such as referential dependencies and predication. Let us call the former
representation the surface form of the structural description, and the latter
representation the logical form. (The number of levels of representation
does not in and of itself affect the computational complexity.)

Theory 4.1 The logical form of ellipsis is constructed by (recursively) copying
the overt structure into the position of the corresponding ellipsed structure;
an anaphoric element α may link to its antecedent either before or after
copying; when the antecedent of α is a quantified NP β, then α must link to
β after copying.


Evidence. First, the elliptical structure is understood as though it were
really there, as shown in (34).

(34) a. The men [ate dinner] and the women did [e] too.
     b. 'the women did eat dinner too'


This fact about our linguistic knowledge must be represented somehow in
the logical form, perhaps by copying the overt structure into the position of
the null structure, as first suggested by Chomsky

Second, an elliptical structure may itself be understood as containing an
elliptical structure, as in (35a), which is understood to mean (35b).

(35) a. Jack [[corrected his spelling mistakes]_1 before the teacher did
        [e]_1]_2 and Ted did [e]_2 too.
     b. Jack corrected his spelling mistakes before the teacher did
        correct his spelling mistakes and Ted did correct his spelling
        mistakes before the teacher did correct his spelling mistakes.

This suggests that copying is a recursive process. The depth of recursion
does not appear to be constrained by the principles of grammar, as shown
in (36).

(36) Emily [claims that Jack [[corrected his spelling mistakes]_1 before
     the teacher did [e]_1]_2 and that Ted did [e]_2 too]_3, but Bob doesn't
     [e]_3.
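
The recursive copying that theory 4.1 attributes to examples like (35) can be made concrete with a small sketch. The nested-tuple encoding of the surface form and the helper names `collect` and `expand` are assumptions introduced here for illustration; they are not part of the theory, and the sketch copies words verbatim, ignoring verb morphology and all linking.

```python
# Encoding assumed for this sketch only:
#   str                 -> an overt word
#   ("vp", k, [items])  -> an overt constituent carrying index k
#   ("e", k)            -> an ellipsis site whose antecedent is constituent k

def collect(items, table=None):
    """Map each index k to the items of the overt constituent labelled k."""
    table = {} if table is None else table
    for item in items:
        if isinstance(item, tuple) and item[0] == "vp":
            table[item[1]] = item[2]
            collect(item[2], table)
    return table

def expand(items, table):
    """Spell out the logical form, copying overt structure into each [e]."""
    words = []
    for item in items:
        if isinstance(item, str):
            words.append(item)
        elif item[0] == "vp":
            words.extend(expand(item[2], table))
        else:  # ("e", k): copy constituent k, itself expanded recursively
            words.extend(expand(table[item[1]], table))
    return words

# Surface form of example (35a).
surface = [
    "Jack",
    ("vp", 2, [("vp", 1, ["corrected", "his", "spelling", "mistakes"]),
               "before", "the", "teacher", "did", ("e", 1)]),
    "and", "Ted", "did", ("e", 2), "too",
]
logical = expand(surface, collect(surface))
print(" ".join(logical))
```

Because the antecedent of [e]_2 itself contains [e]_1, the copy made for [e]_2 triggers a second copy inside it, which is the sense in which copying is recursive.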

Third, the invisible structure behaves as though it was really there. In
particular, it can induce a violation of obviation, as in the discourse (37).

(37) Ann: Romeo_1 wants Rosaline to [love him_i] (i = 1)
     Ben: Not any more -- now Rosaline wants Romeo_1 to [e]
          ([love him_i], i ≠ 1)

In this example, Ann's use of the pronoun him is most naturally understood
as referring to Romeo. Yet when Ben replies, the coreferential interpretation
(i = 1) is no longer possible in Ann's statement. These facts of invisible
obviation are difficult to explain unless the overt structure is in fact copied
in the syntax, as illustrated in (38), where the obviation violation between
him_1 and Romeo_1 has been made explicit by copying the overt VP love him
into the position of the null VP.

(38) Rosaline wants [Romeo_1 to love him_1]

The invisible structure is not merely an invisible VP-pronoun, simply
because the obviation violation in (39a) vanishes when an overt pronoun is
used instead in (39b).

(39) a. Juliet_1 thought the Friar_2 [poisoned her_1] without realizing that
        she_*1 did [e].
     b. Juliet_1 thought the Friar_2 [poisoned her_1] without realizing that
        she_1 did it.


Fourth, corresponding anaphoric elements in the overt and invisible structures
may be understood as having different antecedents, as in (40), where
the invisible pronoun his is ambiguous, referring either to Felix ('invariant'
interpretation) or Max ('covariant' interpretation).⁵

(40) Felix_1 [hates his_1 neighbors] and so does Max_2 [e].
     ([hates his_{1/2} neighbors])

This suggests that an anaphoric element may be linked to (that is, related
to) its antecedent either before the overt structure is copied, resulting in the
invariant interpretation (41), or after, resulting in the covariant interpretation
(42).

(41) a. Felix_1 [hates his_1 neighbors] and so does Max_2 [e].
     b. Felix_1 [hates his_1 neighbors] and so does Max_2 [hate his_1
        neighbors].

(42) a. Felix_1 [hates his neighbors] and so does Max_2 [e].
     b. Felix_1 [hates his_1 neighbors] and so does Max_2 [hate his_2
        neighbors].


Fifth, the invisible pronoun must agree with its antecedent, which excludes
the covariant interpretations in (43) that are possible in the minimally
different examples in (44).

(43) a. Barbara_1 read her_1 book and Eric_2 did [e] too.
        ([read her_{1/*2} book])
     b. We_1 ate our_1 vegetables and so did Bob_2 [e].
        ([ate our_{1/*2} vegetables])

(44) a. Barbara_1 read her_1 book and Kate_2 did [e] too.
        ([read her_{1/2} book])
     b. We_1 ate our_1 vegetables and so did they_2 [e].
        ([ate our_{1/2} vegetables])

⁵ In each example, careful attention must be paid to the relevant construal of the null
structure, indicated with brackets, and the intended reference of anaphoric elements, as
indicated in the italicized parenthetical following the example.






Sixth, the covariant interpretation is forced when the antecedent of the
anaphoric element is a quantified noun phrase (QNP), as shown in (45).
(That is, (45) must mean that every boy ate his own dinner; it cannot mean
that every boy ate every man's dinner.)

(45) Every man_1 [ate his_1 dinner] and so did every boy_2 [e].
     ([ate his_{*1/2} dinner])

Therefore, an anaphoric element α must be linked to its antecedent β after
copying when β is a quantified noun phrase.

To summarize, we have seen evidence that the overt structure must be copied
to the position of the null structure in the syntax, that copying is a recursive
process, that anaphoric elements may be linked to their antecedents
either before or after the copying, and that they must be linked after copying
when their antecedent is a quantified noun phrase. □

The earliest account of invariant and covariant interpretations in VPE, due
to Ross (1967), is equivalent to this theory 4.1, because deletion in Ross's
deep-structure to surface-structure derivation is identical to copying in the
surface form to logical form mapping. This model has also been proposed
in recent government-binding literature. See, for example, Pesetsky (1982),
May (1985), Koster (1987), and Kitagawa (1989).

More generally, any linguistic theory that represents the meaning of an
elliptical utterance using devices that can achieve the effect of copy and link
operations will inherit the complexity of the copy theory 4.1. This is true
regardless of how that linguistic theory is defined, how many levels of
representation it has, or what they are called.

4.2.2 Complexity outside NP

In this report, we only consider the problem of assigning structural
descriptions to utterances, which is a trivial subproblem of the much more
intractable and less well-understood problem of determining the semantic
'truth value' of a given utterance. The following proof shows that assigning a
complete structural description to a given class of utterances can be as difficult
as determining the truth of quantified Boolean formulas; the proof does
not make the entirely unnourishing argument that determining the 'truth'
of human language utterances can be as difficult as determining the truth
of quantified Boolean formulas.


Lemma 4.2.1 The anaphora problem is PSPACE-hard.

Proof. By reduction from QUANTIFIED 3SAT. The input Ω is a quantified
Boolean formula in prenex 3-CNF, consisting of alternating quantifiers
∀x_1 ∃x_2 ... ∀x_{n-1} ∃x_n preceding (and quantifying the literals in) the clauses
C_1, C_2, ..., C_p in the Boolean variables x_1, x_2, ..., x_n. Each clause contains
exactly three distinct literals labeled by C_j = (a_j ∨ b_j ∨ c_j).
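
For orientation, the problem being reduced from can be sketched as a short recursive evaluator. This is only an illustration of what it means for a prenex formula to be true; the representation (quantifiers as pairs, literals as (variable, polarity) pairs) and the name `evaluate` are assumptions of the sketch, and the two example formulas are invented here rather than taken from the proof.

```python
def evaluate(quantifiers, clauses, assignment=None):
    """Decide a prenex quantified Boolean formula by brute force.

    quantifiers: list of ("forall" | "exists", variable) pairs, outermost first
    clauses:     list of clauses, each a tuple of (variable, polarity) literals
    """
    assignment = assignment or {}
    if not quantifiers:
        # Matrix in CNF: every clause needs at least one true literal.
        return all(any(assignment[var] == polarity for var, polarity in clause)
                   for clause in clauses)
    (kind, var), rest = quantifiers[0], quantifiers[1:]
    branches = (evaluate(rest, clauses, {**assignment, var: value})
                for value in (False, True))
    # A universal needs both branches to hold; an existential needs only one.
    return all(branches) if kind == "forall" else any(branches)

prefix = [("forall", "x1"), ("exists", "x2"), ("forall", "x3"), ("exists", "x4")]

true_case = [(("x1", True), ("x2", True), ("x3", True)),
             (("x1", False), ("x2", False), ("x4", False))]
false_case = [(("x1", True), ("x3", True), ("x4", True)),
              (("x1", True), ("x3", True), ("x4", False))]

print(evaluate(prefix, true_case), evaluate(prefix, false_case))
```

The evaluator explores both values of every variable, so its running time is exponential in n while its recursion depth is only n; the reduction below reproduces the same branching with VP-ellipsis and strong crossover.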

The output is a surface form S and a set A of available antecedents, such
that all anaphoric elements in S have antecedents in A if and only if Ω
is true. In order to verify that all anaphoric elements in S have antecedents,
we must construct the logical form of S. The reduction uses one binary
agreement feature to represent literal negation, and one n-valued agreement
feature (or equivalently, log_2 n binary agreement features) to identify the n
distinct Boolean variables.

The idea of the proof is to mimic the structure of Ω with linguistic
constructions, by reducing the quantification of variables in Ω to the linking of
pronouns in S. Each quantifier Qx in Ω will correspond to a pair of available
antecedents in S, one to represent x = 0 and the other to represent
x = 1. Boolean literals in Ω will correspond to pronouns in S. As shown
in figure 4.3, the surface form S is built from three distinct components:
universal quantifiers, existential quantifiers, and Boolean clauses. We will
now motivate each of these parts in turn, using intricate yet still natural
English sentences.

The first step is to simulate a universal quantifier. Recall that a universally
quantified predicate [∀x_i P(x_i)] is true if and only if [P(x_i = 0) ∧ P(x_i = 1)].
The latter Boolean formula can be expressed in a VP-ellipsis construction
whose surface form is abstracted in (46).



Figure 4.3: The surface form S that corresponds to the input instance Ω =
∀x_1 ∃x_2 ... ∀x_{n-1} ∃x_n [C_1, C_2, ..., C_p]. The quantifier constructions contain two
antecedents to represent the two possible truth assignments to the quantified variable.
Each universal quantifier ∀x_i is represented by a VP-ellipsis template. In the logical
form that corresponds to S, each of the n/2 circled overt VPs is copied to its
corresponding ellipsed VP position [VP e], according to the copy-and-link theory 4.1.
Each existential quantifier ∃x_{i+1} is represented by an extraposed strong crossover
template, as discussed in the text. Each clause C_j is represented by a pigeonhole
construction that contains three pronouns, one for each literal in C_j; one of these
pronouns (the selected pronoun) must link to an antecedent outside that construction,
in some dominating quantifier construction. These obligatory long-distance
links are drawn with dashed arrows. The selected pronouns represent the literals
that satisfy the clauses.
















(46) [S [QNP x_i = 0] [VP ... p_i ...]] and so do [S [QNP x_i = 1] [VP e]]


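According to the copy-and-link theory 4.1, the language user's knowledge
of the construction (46) is represented in the abstracted logical form (47).
First, the overt VP is copied to the position of the null VP. Next, pronouns
inside the original and copied VPs link to their antecedents independently.

(47) [logical form diagram not reproduced: the overt VP of (46) copied into the
     position of the null VP, with each copy of p_i linked to the subject of its
     own conjunct]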


The VP is used by the reduction to represent the Boolean predicate P;
the embedded pronoun p_i represents a true literal of x_i inside P; the two
QNP subjects represent the truth values x_i = 0 and x_i = 1, respectively.
Each p_i must link to the subject of its own conjunct in the logical form,
because the subjects are quantified noun phrases. Therefore the pronoun
p_i in the first VP may only link to the first subject [QNP x_i = 0], which
represents the conjunct P(x_i = 0), and the pronoun p_i in the second (copied)
VP may only link to the second subject [QNP x_i = 1], which represents the
conjunct P(x_i = 1). As shown in figure 4.3 above, the verb phrase will also
contain the construction (48) that represents the next quantifier ∃x_{i+1}.

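The second step is to simulate an existential quantifier. An existentially
quantified predicate [∃x_{i+1} P(x_{i+1})] is true if and only if [P(x_{i+1} = 0) ∨
P(x_{i+1} = 1)]. The latter Boolean formula can be expressed in a construction
whose surface form is (48).

(48) [tree diagram not reproduced: a sentence containing two NPs, representing
     x_{i+1} = 0 and x_{i+1} = 1, and an embedded sentence containing the
     pronoun p_{i+1}]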



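This structure will have two possible meanings, as represented by the two
logical forms in (49):

(49) [diagrams not reproduced: (a) p_{i+1} linked to [NP x_{i+1} = 0];
     (b) p_{i+1} linked to [NP x_{i+1} = 1]]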



The embedded sentence represents the predicate P(x_{i+1}); the embedded
pronoun p_{i+1} represents a true literal of x_{i+1} inside the predicate P; the
two NPs represent the truth values x_{i+1} = 0 and x_{i+1} = 1, respectively.
Linguistic constraints to be disclosed below ensure that p_{i+1} can only be
linked to one of the two noun phrases, and that p_{i+1} can be linked to the
first NP [NP x_{i+1} = 0] if and only if P(x_{i+1} = 0) is true, as shown in (49a),
and that p_{i+1} can be linked to the second NP [NP x_{i+1} = 1] if and only
if P(x_{i+1} = 1) is true, as shown in (49b). The embedded clause will also
contain the construction (46) that represents the next quantifier ∀x_{i+2}, as
shown in figure 4.3.

In order to ensure consistency of truth assignments, all embedded pronouns
representing true literals of x_{i+1} must link to the same antecedent. This
constraint may be enforced using the powerful strong crossover configuration
introduced in the previous section. The details of how this might be done
arose from discussion with Alec Marantz, who suggested all the examples.

Recall that strong crossover is the configuration where an anaphoric
element α intervenes between a displaced wh-phrase and its trace, and
c-commands the trace. In such a configuration, α obviates the subject of the
wh-phrase.

(50) a. Who_k did he_*k say Mary kissed t_k?
     b. [the man]_1 [who_k he_*1 likes t_k]

The noun phrase in (50b) contains a relative clause who he likes t that
predicates [the man]; the pronoun he is in a strong crossover configuration,
and therefore cannot refer to [the man], which is the subject of the relative
clause.

Now consider the effect of extraposing a relative clause containing a strong
crossover configuration in (51).

(51) a. At the airport, a man_1 met Jane_2, who_{k=1/*2} she_2 likes t_k.
     b. At the airport, a man_1 met Jane_2, who_{k=*1/2} he_1 likes t_k.


In (51a), if we understand she as referring to Jane, then we must understand
who as predicating a man. Conversely, if we understand he as referring
to a man in (51b), then who must predicate Jane. This simple example
establishes the ambiguity of predication when the predicate is an extraposed
relative clause containing a strong crossover configuration.

When the extraposed relative clause contains two obviative pronouns, as
in (52), then the sentences cannot have the intended interpretation because
the relative clause must predicate a subject, yet cannot without violating
strong crossover.

(52) a. *At the airport, a man_1 met Jane_2, who_k she_2 thinks he_1 likes
        t_k.
     b. *At the airport, a man_1 met Jane_2, who_k he_1 thinks she_2 likes
        t_k.


This example establishes that the strong crossover configuration gives rise 
to inviolable obviation between the wh-phrase and all embedded pronouns 
that c-command its trace. 




Now we have our construction:

(53) At the airport, NP_0 met NP_1 [who_k ... α_*k ... t_k].

As before, two possible antecedents NP_0 and NP_1 represent the truth
assignments x_{i+1} = 0 and x_{i+1} = 1, respectively. Pronouns in the embedded
clause that represent true negative literals of x_{i+1} can only link to the 'false'
noun phrase NP_0; pronouns that represent true positive literals of x_{i+1} can
only link to the 'true' noun phrase NP_1. Observe that the relative pronoun
who_k may predicate either NP_0 or NP_1 in the example (53). The strong
crossover configuration ensures that all anaphoric elements α in the extraposed
relative clause obviate the subject of the wh-phrase who_k. Therefore,
once the ambiguous predication relation is determined, pronouns representing
literals of x_{i+1} must all be linked to the same antecedent because (i) the
pronouns must all obviate the predicated noun phrase by strong crossover
and (ii) there is only one other permissible antecedent by construction. This
exactly corresponds to assigning a consistent truth value to x_{i+1} everywhere.

The third and final step of the reduction is to simulate a Boolean 3-clause
C_j = (a_j ∨ b_j ∨ c_j) using the pigeonhole principle. A Boolean clause C_j is
true if and only if one of its literals is true: let us call the literal that
satisfies the clause the selected literal. Only selected literals need be assigned
consistent truth values; nonselected literals simply don't matter, and can
receive any arbitrary inconsistent value, or none at all. We have been
reducing the quantification of variables to the binding of pronouns, and so
must now represent each literal in C_j with a pronoun. For each 3-clause,
the reduction builds a sentence that contains three disjoint pronouns and
only two possible antecedents. At least one of the pronouns must be bound
by an antecedent outside the sentence; this pronoun represents the selected
literal. The following English sentence shows how this works:

(54) [S [the student] thought [the teacher] said that
     [he_a introduced her_b to him_c]]

Only two neutral antecedents [the student] and [the teacher] are locally
available to the three obviative pronouns he_a, her_b, and him_c in this construction.
Therefore at least one of these three pronouns must be bound outside the
sentence, by one of the noun phrases in some dominating quantifier
construction (either (46) or (48)). This selected pronoun corresponds to a true
literal that satisfies the clause C_j. Agreement features on pronouns and
their antecedents ensure that a pronoun representing a literal of x_i can only
link to an antecedent representing the quantifier of x_i.
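
The pigeonhole step can be checked by brute force. The sketch below is a simplification made for illustration: it treats the three pronouns of (54) as mutually obviative, ignores agreement features entirely, and uses the hypothetical labels NP_0 and NP_1 for antecedents supplied by some dominating quantifier construction. The function name `linkings` is likewise an assumption.

```python
from itertools import product

def linkings(pronouns, local, outside):
    """Yield every linking in which no two pronouns share an antecedent."""
    for choice in product(local + outside, repeat=len(pronouns)):
        if len(set(choice)) == len(choice):  # mutual obviation
            yield dict(zip(pronouns, choice))

pronouns = ["he_a", "her_b", "him_c"]
local = ["the student", "the teacher"]
outside = ["NP_0", "NP_1"]  # hypothetical antecedents in a dominating construction

all_linkings = list(linkings(pronouns, local, outside))
# Fewest pronouns linked outside the clause, over all permissible linkings.
min_outside = min(sum(antecedent in outside for antecedent in linking.values())
                  for linking in all_linkings)
# With only the two local antecedents, no permissible linking exists at all.
local_only = list(linkings(pronouns, local, []))

print(len(all_linkings), min_outside, local_only)
```

Every permissible linking sends at least one pronoun outside the clause, and that pronoun plays the role of the selected literal.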

Note that this construction is contained inside n/2 VP-deletion constructions
in the surface form of the entire sentence S, and that therefore the
corresponding logical form will contain 2^{n/2} copies of each such construction,
each copy with its own selected pronoun. (This corresponds to the fact
that different literals may satisfy a given quantified clause, under different
quantifier-determined truth assignments.) The verb phrase that appears in
our English example (54) as [he introduced her to him] will immediately
contain the construction representing the next Boolean clause C_{j+1}, as shown
in figure 4.3.
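
The gap between the size of the surface form and the size of its logical form can be seen in a toy model of the nesting. The sketch keeps only the VP-ellipsis templates and omits the existential constructions between them; the bracket labels Q0 and Q1 and the token CLAUSE are placeholders invented for the sketch.

```python
def surface_form(universals, clause="CLAUSE"):
    """Nest `universals` VP-ellipsis templates; each ellipsed VP stays empty."""
    if universals == 0:
        return clause
    inner = surface_form(universals - 1, clause)
    return f"[Q0 [VP {inner}]] and so do [Q1 [VP e]]"

def logical_form(universals, clause="CLAUSE"):
    """Same nesting after copying: each ellipsed VP holds a copy of the overt VP."""
    if universals == 0:
        return clause
    inner = logical_form(universals - 1, clause)
    return f"[Q0 [VP {inner}]] and so do [Q1 [VP {inner}]]"

for k in range(5):
    print(k, surface_form(k).count("CLAUSE"), logical_form(k).count("CLAUSE"))
```

With k nested templates the clause occurs once in the surface form and 2^k times in the logical form, which for k = n/2 gives the 2^{n/2} copies mentioned above.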

The pigeonhole construction representing C_j is permissible iff all of its
logical form copies are permissible, which is only possible when the Boolean
clause C_j contains a true literal for any possible quantifier-determined truth
assignment to its literals, as represented by the dominating quantifier
constructions (either (46) or (48)). Therefore, the logical form for the complete
surface form S is permissible iff the quantified Boolean formula Ω is true.
□

Note that the constructions used in this proof to represent existential
quantifiers (48) and Boolean clauses (54) can be combined to give a third direct
NP-hardness proof for the anaphora problem, where each pronoun is no
more than four-ways ambiguous and no elliptical contexts are used. Such a
proof requires significantly fewer agreement features than used in the proof
of lemma 4.1.1.

The epilogue to this proof is a demonstration of how the preceding reduction
might concretely represent the QBF formula ∀x∃y[(¬y ∨ x ∨ y), (¬x ∨ x ∨ y)] in
an actual English sentence.

There are two minor difficulties that are entirely coincidental to the English
language: the English plural pronoun they is unspecified for gender; there
are no entirely neutral arguments in English that can be the antecedent of
any pronoun. Rather than construct our example in a different language,
say Italian, let us make the following allowances. To overcome the first
difficulty, let they be the masculine plural pronoun, and them the feminine
plural pronoun. To overcome the second difficulty, we observe that a plural
pronoun can always have a split antecedent, as in example (55), and that
the condition of local obviation holds between they and him. That is, they
and him cannot share an antecedent when they are locally obviative.

(55) John_1 suggested to Tom_2 that they_{1,2} tell him_{*1,*2} to leave.

We will use split antecedents below.

The given formula has two variables, x and y, which we will identify via the
plural/singular number distinction: plural pronouns represent literals of x,
while singular pronouns represent literals of y. Negation will be represented
via the masculine/feminine gender distinction: masculine pronouns for
negative literals, feminine pronouns for positive literals. These correspondences
are summarized in the table:

    ¬x ↦ they     x ↦ them
    ¬y ↦ he       y ↦ she

The constructed sentence consists of four components:

* The VP-ellipsis construction (46) to represent ∀x:

(56) [[NP_0 some stewards] [VP say [S ...]]]
     and so do [[NP_1 some stewardesses] [VP e]]

* The extraposed strong crossover configuration (53) to represent ∃y:

(57) [S at the airport [NP_0 a KGB man] met [NP_1 Jane], [S' who_k [... t_k]
     and [... t_k]]]

* The pigeonhole construction (54) to represent (¬y ∨ x ∨ y) using split
antecedents.

(58) [S the officer, the agent, and the mechanic suspected
     [he expected them to talk to her]]

There are three locally available antecedents, all singular and unspecified
for gender. The three pronouns in the embedded clause are obviative,
and require at least four singular antecedents. Therefore, at
least one of the pronouns must be linked to an argument outside the
construction (58).

* A second pigeonhole construction to represent (¬x ∨ x ∨ y), again using
split antecedents.

(59) [S the crew, the pilot, and the co-pilot knew [they traded them
     to her]]

There are three locally available antecedents: one is plural neuter (the
crew), and the remaining two are singular neuter. The three pronouns
in the embedded clause are obviative, and require at least one plural
antecedent and three singular antecedents. Therefore, at least one of
the pronouns must be linked to an argument outside the construction
(59).

The resulting sentence, in all its glory, is:

(60) [[NP_0 some stewards] [VP say
       [S at the airport [NP_0 a KGB man] met [NP_1 Jane], [S' who_k
         [S the officer, the agent, and the mechanic suspected
           [he expected them to talk to her about t_k]]
         and
         [S the crew, the pilot, and the co-pilot knew
           [they traded them to her for t_k]]]]]]
     and so do
     [[NP_1 some stewardesses] [VP e]]

This concludes the presentation of the lemma 4.2.1.

4.2.3 Ellipsis reconsidered 

In the previous section, we proved that the anaphora problem is PSPACE-
hard. The thesis we are defending states that language comprehension is
NP-complete. Therefore, the thesis predicts that there is a defect in the
linguistic analysis that led to the PSPACE-hardness result. The thesis also
tells us exactly where to look for the defect: we must reexamine that part
of the analysis that allowed us to simulate a computation outside of NP.
In the case of a reduction from QBF, the defect must be in that part of
the analysis used to simulate the unnaturally powerful universal quantifier.
Therefore, let us reexamine the copy theory 4.1 of ellipsis.




A copy operation naturally makes two predictions; neither holds.

The first prediction is that the original (overt) structure and its copy will
obey the same post-copying linguistic constraints, including agreement and
the linking conditions. (If agreement and the linking conditions did not
apply after copying, then it would always be possible to vacuously satisfy
those constraints, simply by postponing all linking until after copying had
applied. Therefore, agreement and the linking conditions must apply both
before and after copying.) This expected post-copying equivalence is
violated. Although overt pronouns must agree with their antecedent in gender
and number (61a), copied pronouns can disagree with their antecedents as
in (61b):⁶

(61) a. Tom_1 read his_{1/*2} book and Barbara_2 read his_{1/*2} book (too).
     b. Tom_1 [read his_1 book] and Barbara_2 did [e] too.
        ([read his_{1/2} book])

Moreover, although overt anaphora must have local antecedents in (62a),
copied anaphora need not, as shown in (62b):

(62) a. The prisoner_1 shot himself_1 before [the executioner_2 could shoot
        himself_{*1/2}].
     b. The prisoner_1 [shot himself_1] before the executioner_2 could [e].
        ([shoot himself_{1/2}])


The second prediction is that processes that apply after copying, such as
linking, will apply independently in both the original (overt) structure and
its copy. This expected post-copying independence is also violated. In
particular, linking is not independent in both the original structure and its copy,
as shown by example (63), which is incorrectly predicted to have five read-
⁶ The difficulty of obtaining the covariant interpretation for Barbara read her book
and Eric did too or for We ate our vegetables and so did Bob does not weaken my criticism
of the copy theory 4.1. My criticism is based on the necessity of discriminating (61a)
and (61b), which the copy theory is unable to do. In order to account for the possible
invariant-covariant contrast between masculine and feminine pronouns, we suggest in
appendix B that the thematic position assigned to some anaphoric elements α will inherit
the agreement features of α, and in these cases α must agree with both of its antecedents.
An alternate approach, to say that he is the "default bound variable," would incorrectly
suggest that the covariant interpretation of she is never available.



ings (two when linking precedes copying, four when linking follows copying,
and one overlap).

(63) Bob [introduced Felix to his neighbors] and so did Max [e].

In particular, there should be a reading where the overt his refers to Felix
and the null/copied his refers to Max. However, this reading is not available.
In fact, only three interpretations of (63) are attested (two invariant, one
covariant), as shown in (64):

(64) a. Bob_1 [introduced Felix_2 to his_2 neighbors] and so did Max_3 [e].
        ([introduced Felix_2 to his_{*1/2/*3} neighbors])
     b. Bob_1 [introduced Felix_2 to his_1 neighbors] and so did Max_3 [e].
        ([introduced Felix_2 to his_{1/3} neighbors])

In other words, a pronoun must link to the same position in both the visible
verb phrase and its understood copy. This is not real copying, but a kind of
logical predicate-sharing that can always be represented without copying.
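
The count of five readings predicted by the copy theory for (63), against the three that are attested, can be reproduced by simple enumeration. The candidate antecedent sets below are assumptions read off the discussion of (63): the overt his may link to Bob or Felix, and a copied his that links after copying may link to Max or Felix. Each reading is written as a pair (referent of overt his, referent of understood his).

```python
from itertools import product

overt_candidates = ["Bob", "Felix"]  # assumed antecedents for the overt "his"
copy_candidates = ["Max", "Felix"]   # assumed antecedents for the copied "his"

# Linking before copying: the copy inherits the link of the overt pronoun.
before = {(a, a) for a in overt_candidates}
# Linking after copying: the two pronouns link independently.
after = set(product(overt_candidates, copy_candidates))

predicted = before | after
attested = {("Felix", "Felix"), ("Bob", "Bob"), ("Bob", "Max")}
unattested = predicted - attested

print(len(before), len(after), len(before & after), len(predicted))
print(sorted(unattested))
```

The two predicted but unattested pairs are exactly those in which the overt and understood pronouns link to different positions, including the Felix/Max reading singled out in the text.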

Let us therefore propose the following predicate-sharing theory 4.2 of ellipsis: 

Theory 4.2 The logical form of ellipsis is constructed by sharing the same
thematic predicate between the overt and ellipsed structures; obviation is a
relation between argument positions in a thematic predicate; an anaphoric
element may link to an argument or to an argument position.

Evidence. First, verbs are thematic functions from their direct and indirect
objects to a verb phrase; a verb phrase is a function from the inflection and
the subject to a logical proposition, that is realised syntactically as a clause.
For example, the expression Felix hates vegetables would be assigned the
logical form (65).

(65) (λx.[x hates vegetables]) (λP.[(Felix P)])

Second, VP-ellipsis is the sharing of one VP predicate between two clauses.
One way to represent the logical form of an elliptical structure is to
lambda-abstract the VP predicate. For example, the surface form (66a) would be
assigned the logical form representation (66b):

(66) a. [Felix [ate dinner]] and so did [Tom [e]]
     b. (λx.[x ate dinner]) (λP.[(P Felix) and so did (P Tom)])

Third, obviation is a relation between the argument positions in the VP
predicate, as illustrated in (67b) for the surface form (67a).

(67)
a. Romeo_1 wants Rosaline to [love him_{*1}] before wanting himself_1
   to [e].
b. (λx.[x to love him_i])
   (λP.[Romeo_1 wants [(Rosaline P)] before wanting [(himself_1 P)]])
   ⇒ [i ≠ 1]


This logical form representation accounts for all the cases of invisible
obviation, without an unnaturally powerful copying operation.

Fourth, an anaphoric element may link to an argument directly (68b),
resulting in the invariant interpretation, or indirectly, to an argument
position in the VP predicate (68c), resulting in the covariant interpretation.

(68)
a. Felix_i [hates his_i neighbors] and so does Max [e]
b. (λx.[x hates his_i neighbors]) (λP.[(Felix_i P) and (Max P)])
c. (λx.[x hates his_x neighbors]) (λP.[(Felix P) and (Max P)])


This predicate-sharing theory 4.2 correctly predicts that the example (63)
has exactly three interpretations, one for each of the three possible verbal
predicates shown in (69).

(69)
a. (λx.[x introduced Felix to Felix's neighbors])
   (λP.[(P Bob) and (P Max)])
b. (λx.[x introduced Felix to x's neighbors])
   (λP.[(P Bob) and (P Max)])
c. (λx.[x introduced Felix to Bob's neighbors])
   (λP.[(P Bob) and (P Max)])


While predicate-sharing is conceptually simple, an extensive investigation is
needed to confirm such a linguistic theory. This is the task of appendix B.
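The prediction for example (63) can be checked mechanically. The following is a minimal sketch, not part of the original analysis: it models the VP of (63) as a Python function of its subject, builds the three predicates of (69), and applies one shared predicate to both subjects. The function names and the string encoding of readings are assumptions made for illustration only.

```python
def shared_predicates(subject, obj):
    """Enumerate the VP predicates for 'SUBJ introduced OBJ to his neighbors'.

    The pronoun links either directly to an argument (invariant reading)
    or to the open subject position of the predicate (covariant reading).
    """
    predicates = []
    for antecedent in (subject, obj):  # direct link to an argument
        predicates.append((
            "invariant",
            lambda x, a=antecedent: f"{x} introduced {obj} to {a}'s neighbors",
        ))
    # indirect link: the pronoun is bound to the predicate's own subject slot
    predicates.append((
        "covariant",
        lambda x: f"{x} introduced {obj} to {x}'s neighbors",
    ))
    return predicates


def interpret(predicate, overt_subject, ellipsed_subject):
    """Apply ONE shared predicate to both subjects; nothing is copied."""
    return predicate(overt_subject), predicate(ellipsed_subject)


readings = [
    (kind, interpret(pred, "Bob", "Max"))
    for kind, pred in shared_predicates("Bob", "Felix")
]
for kind, (overt, ellipsed) in readings:
    print(kind, "|", overt, "|", ellipsed)
```

Because the same predicate is applied in both clauses, the unattested reading (overt his = Felix, understood his = Max) cannot be produced at all; under this encoding it would require two different predicates.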




The predicate-sharing theory 4.2 gives us the upper bound predicted by the
complexity thesis:

Theorem 10 The anaphora problem is in NP for nonelliptical structures,
and for elliptical structures with predicate-sharing.

Proof. Covert arguments in a structure are either coreferential with an
overt argument in the structure (for example, control PRO or wh-trace), in
which case they may be coalesced with their overt antecedent, or they
are assigned an arbitrary interpretation (PRO_arb), in which case they do
not participate in the graph of referential dependencies and may be ignored
entirely. Therefore, the number of obviation relations is at most quadratic in
the number of overt arguments, an upper bound that is attained in the case
of a complete obviation graph. The logical forms licensed by the
predicate-sharing theory 4.2 are nearly the same size as their corresponding
surface forms, because we can always lambda-abstract the shared predicate, if
the structure is elliptical. Otherwise, the structure is nonelliptical and
logical and surface forms are the same size, because operators that map surface
forms to logical forms, such as quantifier scope assignment, do not increase
the number of arguments and therefore cannot increase the size of the graph
of referential dependencies beyond quadratic. Next, each anaphoric element
is nondeterministically linked either to an argument in the set A of available
antecedents, or to an open thematic position. Clearly this may be done
in nondeterministic polynomial time. Finally, we check that the linking
conditions are satisfied, including invisible obviation, in deterministic time
proportional to the number of links, verifying the semantics of obviation by
propagating "referential value" markers along the links, checking for cyclic
dependencies, and so forth. □
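The deterministic checking step of the proof can be sketched as follows. This is an illustrative verifier under simplifying assumptions, not the procedure of the report: arguments are plain strings, a linking is a dictionary from each anaphoric element to its chosen antecedent, and obviation is a set of unordered pairs. The name `verify_linking` and this data encoding are hypothetical. The sketch runs in polynomial (at worst quadratic) time and makes no attempt to reach the linear bound mentioned in the proof.

```python
def verify_linking(arguments, links, obviation):
    """Check a guessed linking (the NP witness) in polynomial time.

    arguments: list of overt argument names.
    links:     dict mapping each linked anaphoric element to its antecedent.
    obviation: set of frozenset({a, b}) pairs that must not corefer.
    """
    # 1. Every link must connect known arguments.
    for source, target in links.items():
        if source not in arguments or target not in arguments:
            return False

    # 2. Propagate a "referential value" along the links, rejecting cycles.
    def referential_value(arg):
        seen = set()
        while arg in links:
            if arg in seen:
                return None  # cyclic dependency: no referential value
            seen.add(arg)
            arg = links[arg]
        return arg

    value = {arg: referential_value(arg) for arg in arguments}
    if any(v is None for v in value.values()):
        return False

    # 3. Obviating arguments must receive distinct referential values.
    return all(len({value[a] for a in pair}) == 2 for pair in obviation)


# Toy encoding of (67): 'him' obviates 'Rosaline' visibly, and 'himself'
# invisibly, through the shared predicate.
args = ["Romeo", "Rosaline", "him", "himself"]
obv = {frozenset({"him", "Rosaline"}), frozenset({"him", "himself"})}
print(verify_linking(args, {"himself": "Romeo", "him": "Romeo"}, obv))
print(verify_linking(args, {"himself": "Romeo"}, obv))
```

A linking that passes this check plays the role of the efficient witness: it is no larger than the set of links, and it can be checked without search.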

Theorem 11 The anaphora problem is NP-complete.

Proof. By theorems 10 and 3. □

As we saw above in section 4.2.3, and again in greater detail in appendix B,
the predicate-sharing theory is strictly superior to the copy-and-link theory.
That is, the predicate-sharing theory assigns better structural descriptions
to the class of elliptical utterances than the copy-and-link theory does, and
no utterances are assigned better structural descriptions by the
copy-and-link theory. However, the significance of the predicate-sharing
theory goes beyond merely the number of linguistic examples correctly
classified.


Recall that our central scientific goal is to understand the comprehension,
production, and acquisition of human language; generative theory is
interesting only in so far as it advances this goal. The solution to a
PSPACE-hard problem may be exponentially large in the size of the problem
statement. Unlike problems in NP, PSPACE-hard problems do not have efficient
witnesses. (An efficient witness is a short correctness proof for a solution.
In the case of the anaphora problem, a permissible graph of referential
dependencies serves as the correctness proof.) If anaphora comprehension were
PSPACE-hard, as it is according to the copy-and-link theory of ellipsis,
then the mental representations required to produce and comprehend
elliptical anaphora would be infeasibly large. Language users could not even
comprehend the utterances that they themselves produced. And the
generative theory of anaphora would not yield a plausible account of language
comprehension and production.

But by reducing the complexity of anaphora from PSPACE to NP, we prove
that the anaphora problem has efficient witnesses, and in turn show that
generative theory is the basis of a plausible account of anaphora
comprehension and production.


4.3 Analysis of linguistic theories

It is informative to contrast the approach of this report, the direct analysis
of human language, with a related approach, the analysis of linguistic
theories. In the latter approach, we study the theory-internal computational
problems posed by the theory, such as "compute this predicate defined by the
theory," or "ensure that this constraint of the theory is satisfied."
(Chapter 1 has examined the computational structure of generative
phonological theories in some detail.) This approach is also exemplified by
Giorgi, Pianesi, and Satta (1989) in their complexity investigation of the
binding theory of Chomsky (1981).

The central danger of such an approach is the risk of irrelevance, which
increases whenever we lose sight of the computational problems of language
comprehension, production, and acquisition. Different theories talk about
vastly different things, and hence it is impossible to achieve any kind of
invariance with respect to either phenomena or theory. Moreover, the
computational properties of even the perfect linguistic theory have at best an
indirect connection to the computational properties of human language.




The central computational problem posed by all generative linguistic
theories, of which all theory-internal problems are subproblems, is to
enumerate all and only the possible structural descriptions (that is, possible
linguistic representations). That is, linguistic theory itself poses a
computational problem, the problem of constructing the ith representation in
the enumeration, given the index of enumeration i. (Equivalently, we may think
of i as the encoding of a possible linguistic representation, that must be
verified by the linguistic theory as being permissible.) As elaborated in
chapter 2, the computational problems of enumerating or verifying
representations have at best an indirect connection to human language, which
is the process of constructing structural descriptions of evidence.

Even worse, complexity analyses of linguistic theories are likely to be
irrelevant. For example, consider the many different theories of referential
dependencies. They are stated in vastly different terms: as constraints on
syntactic or discourse-level representations, in terms of the goals and
intentions of speakers and hearers, or even in terms of the objective
"meaning" of utterances in relation to the external world. Let us examine
three very similar theories of anaphora that nonetheless have widely divergent
complexities.

One approach requires all referentially-dependent elements, including
pronouns, to have linguistic antecedents if at all possible (Higginbotham 1983,
or at least its spirit). The theory-internal computational problem posed by
such a theory is to link every pronoun to an argument, subject to conditions
of obviation and acyclicity. As proven in this chapter, the decision problem
posed by this approach is NP-complete.

A second approach postulates a condition of obviation combined with free
indexing of all arguments (Chomsky 1986). The corresponding theory-internal
problems posed are (i) to decide if a given indexing is permissible
(verification) and (ii) to ensure that pronouns may be indexed without
violating the obviation condition (satisfaction). The verification problem is
clearly easy, requiring us to simply compute the obviation relations from the
phrase structure and check that each pronoun is assigned a different index
than any of the arguments that it obviates. Because obviation may always be
satisfied by trivially assigning a different index to every pronoun, the
decision problem for satisfaction requires constant time, that is, always
answer YES.
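The contrast between the two theory-internal problems of free indexing can be made concrete. The following is a minimal sketch under assumed encodings (an indexing is a dictionary from arguments to integers, and obviation is a set of ordered pairs); the function names are hypothetical and do not come from Chomsky (1986).

```python
def verify_indexing(index, obviation):
    """Verification: no pronoun shares an index with an argument it obviates."""
    return all(index[a] != index[b] for a, b in obviation)


def trivial_indexing(arguments):
    """Satisfaction: give every argument a fresh index.

    Distinct indices can never violate an obviation condition, which is why
    the satisfaction decision problem always answers YES.
    """
    return {arg: i for i, arg in enumerate(arguments)}


arguments = ["Bob", "Felix", "him"]
obviation = {("him", "Felix")}  # 'him' may not corefer with 'Felix'

print(verify_indexing({"Bob": 1, "Felix": 2, "him": 2}, obviation))
print(verify_indexing(trivial_indexing(arguments), obviation))
```

The trivial indexing satisfies the theory while leaving every pronoun without an antecedent, which illustrates why this satisfaction problem says nothing about comprehension.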

A third approach only handles cases of so-called bound anaphora, where a
pronoun is interpreted as a variable bound by a quantifier, as in [every man]_i
ate his_i dinner (Reinhart 1983). The theory-internal verification problem
posed is to ensure that every pronoun interpreted as a bound variable is
c-commanded by a natural quantifier. The problem of checking an existing
structure is efficient, requiring time proportional to the size of the
structure. However, even when every pronoun is required to have a linguistic
antecedent, no pronoun need ever be interpreted as a bound variable, and
hence the corresponding decision problem for satisfying Reinhart's theory
only requires constant time.
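The verification problem for bound anaphora can be sketched in a few lines. This is an illustration only: it assumes a simplified definition of c-command in which every nonterminal node branches, and it encodes tree positions as tuples of child indices from the root. Neither the encoding nor the function names are taken from Reinhart (1983).

```python
def c_commands(a, b):
    """Return True if the node at address a c-commands the node at address b.

    Addresses are tuples of child indices from the root. Assumes every
    nonterminal branches, so the parent of a is the first branching node
    dominating a.
    """
    if a[:len(b)] == b or b[:len(a)] == a:
        return False  # one node dominates the other, or they are identical
    parent = a[:-1]
    return b[:len(parent)] == parent


def bound_variables_ok(bindings):
    """Every bound pronoun must be c-commanded by its quantifier."""
    return all(c_commands(quantifier, pronoun) for quantifier, pronoun in bindings)


# [S [NP every man] [VP ate [NP his dinner]]]
every_man = (0,)      # subject NP
his = (1, 1, 0)       # pronoun inside the object NP
print(bound_variables_ok([(every_man, his)]))
```

Each check inspects only the two addresses, so verifying a whole structure takes time bounded by the size of the structure, in line with the efficiency claim above.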

But the theory-internal problems corresponding to the second two approaches
are of no independent interest, being entirely irrelevant to human language.
In studying the computational structure of human language, the only
relevant problems are language comprehension, production, and acquisition.
The computational problem posed by pronouns in the act of comprehension
is to compute their antecedents using no new information. If the language user
fails to do this, then he has failed to comprehend either the pronouns or
the utterance that contains them. The fact that a language user can fail to
comprehend an utterance in constant time by assigning it an inadequate or
incomplete representation is of no interest.

Even if our sole interest is in the computational structure of linguistic
theories, we should still study the complexity of LC problems. Studying LC
problems allows us to more easily compare linguistic theories, and to study
the complex interactions among the different parts of a linguistic theory.
Either a particular LC problem is posed by a particular linguistic theory, or
it is not. If it is not posed by the theory, then that theory is empirically
inadequate, and we understand exactly why and how the theory is inadequate.
Otherwise the LC problem is posed by the theory, and no matter how it is
posed by the particular theory (no matter how it is disguised or carved up
into different parts of the theory, whether in phonology, syntax, discourse,
pragmatics, semantics, or what have you), that linguistic theory inherits
the computational complexity and structure of the LC problem. This is
because complexity theory classifies problems, not algorithms or particular
ways of solving those problems. As long as a linguistic theory poses an LC
problem, the problem of assigning representations to utterances according
to that theory is at least as complex as the LC problem is.

This is well-illustrated by the anaphora problem studied in this chapter.
As long as a particular theory of language has an empirically adequate
description of the observed facts of obviation and antecedence, then the
comprehension problem for that linguistic theory inherits the structure of the
anaphora problem examined here. This is true no matter how this description
is couched, whether in terms of constraints on a syntactic relation of
coindexing or linking, in terms of syntax or discourse, in terms of
speaker-hearer intentions or other pragmatic considerations, or even in terms
of a Montague-like compositional theory of semantic types. If the theory
provides an empirically adequate description of the language user's knowledge
of utterances, then it will inherit the inalienable computational structure of
that knowledge.




Chapter 5 

References 


Baker, Mark. 1985. Incorporation: a theory of grammatical function
changing. Ph.D dissertation, MIT Department of Linguistics and Philosophy.
(Published by University of Chicago Press, 1987.)

Barton, Ed, Robert Berwick, and Eric Ristad. 1987. Computational
complexity and natural language. Cambridge: MIT Press.

Beattie, James. 1788. The Theory of Language. Menston, England: The
Scolar Press Limited. (English Linguistics 1500-1800 Collection, No.88.)

Berwick, Robert, and Amy Weinberg. 1984. The grammatical basis of
linguistic performance: language use and acquisition. Cambridge: MIT
Press.

Browne, Thomas Gunter. 1795. Hermes Unmasked. Menston, England:
The Scolar Press Limited. (English Linguistics 1500-1800 Collection,
No.201.)

Burzio, Luigi. 1988. "On the non-existence of disjoint reference principles."
Colloquium, LSA Meeting, December 1988.

Burzio, Luigi. 1989. "The role of the antecedent in anaphoric relations."
Second Princeton Workshop on Comparative Grammar, Princeton
University, April 27-29.

Chandra, A., D. Kozen, and L. Stockmeyer. 1981. "Alternation." J. ACM
28(1):114-133.




Canny, John F. 1988. The complexity of robot motion planning. Cambridge:
MIT Press.

Chiat, Shulamuth. 1986. "Children's pronouns." in U. Wiesemann (ed.),
1986, pp.381-404.

Chomsky, Noam. 1951. Morphophonemics of modern Hebrew. M.A.
dissertation, University of Pennsylvania. (Published with revisions, Garland
Publishing, New York, 1979.)

Chomsky, Noam. 1955. The logical structure of linguistic theory.
mimeographed, Harvard. (Published in part by Plenum Press, 1975.)

Chomsky, Noam. 1956. "Three models for the description of language."
I.R.E. Transactions on Information Theory, Vol. IT-2, pp. 113-124.
Reprinted, with corrections, in R.D. Luce, R. Bush, and E. Galanter
(eds.), Readings in Mathematical Psychology, Vol II. New York: Wiley,
1965.

Chomsky, Noam. 1965. Aspects of the theory of syntax. Cambridge: MIT
Press.

Chomsky, Noam. 1968. Language and mind. New York: Harcourt Brace
Jovanovich.

Chomsky, Noam. 1980. "On binding." Linguistic Inquiry 11(1):1-46.

Chomsky, Noam. 1980. Rules and representations. New York: Columbia
University Press.

Chomsky, Noam. 1981. Lectures on government and binding. Dordrecht:
Foris Publications.

Chomsky, Noam. 1982. Some concepts and consequences of the theory of
government and binding. Cambridge, MA: MIT Press.

Chomsky, Noam. 1986. Barriers. Cambridge: MIT Press.

Chomsky, Noam. 1986. Knowledge of language: its origins, nature, and
use. New York: Praeger Publishers.

Chomsky, Noam. 1989. "Some notes on economy of derivation and
representation." MIT Working Papers in Linguistics, vol.10, pp. 43-74.

Chomsky, Noam and Morris Halle. 1968. The Sound Pattern of English.
New York: Harper & Row.

Clements, George and S. Jay Keyser. 1983. CV Phonology: a generative
theory of the syllable. Cambridge: MIT Press.




Cordemoy, Geraud de. 1667. A Philosophical Discourse Concerning Speech,
Conformable to the Cartesian Principles.

Curry, Haskell B. 1961. "Some logical aspects of grammatical structure."
Proceedings of Symposia in Applied Mathematics, volume XII, pp. 56-68.

Dahl, Osten. 1972. "On so-called 'sloppy identity'." Gothenburg Papers in
Theoretical Linguistics 11, University of Goteborg, Sweden.

Dahl, Osten. 1974. "How to open a sentence: abstraction in natural
language." Logical Grammar Reports 12, University of Goteborg, Sweden.

Finer, Daniel. 1984. The formal grammar of switch-reference. Unpublished
Ph.D dissertation, University of Massachusetts at Amherst.

Fodor, Jerry. 1983. The modularity of mind. Cambridge: MIT Press.

Fukui, Naoki. 1986. A theory of category projection and its applications.
Unpublished Ph.D dissertation, MIT Department of Linguistics and
Philosophy.

Gazdar, Gerald, Ewan Klein, Geoffrey Pullum, and Ivan Sag. 1985.
Generalized phrase structure grammar. Oxford, England: Basil Blackwell.

Giorgi, Alessandra, Fabio Pianesi, and Giorgio Satta. 1989. "The
computational complexity of binding theory's satisfiability and verification
problems." Trento, Italy: unpublished manuscript, IRST.

Haik, Isabelle. 1987. "Bound VPs that need to be." Linguistics and
Philosophy 10:503-530.

Hale, Kenneth. 1982. "The logic of Damin kinship terminology." in Oceania
Linguistic Monograph no.24, The Languages of Kinship in Aboriginal
Australia, ed. by Jeffrey Heath, Francesca Merlan, and Alan Rumsey.

Halle, Morris. 1961. "On the role of simplicity in linguistic descriptions."
Proceedings of Symposia in Applied Mathematics, volume XII, Structure
of Language and its Mathematical Aspects, pp. 89-94.

Halle, Morris. 1962. "Phonology in generative grammar." Word
18(1-2):54-72.

Halle, Morris. 1985. "Speculations about the representation of words in
memory." In Phonetic Linguistics, Essays in Honor of Peter Ladefoged,
V. Fromkin, ed. New York: Academic Press.

Halle, Morris and G.N. Clements. 1983. Problem Book in Phonology.
Cambridge: MIT Press.




Herder, Johann Gottfried von. 1827. Treatise upon the origin of language.
London: Camberwell Press.

Higginbotham, James. 1983. "Logical form, binding, and nominals."
Linguistic Inquiry 14:395-420.

Higginbotham, James. 1985. "On semantics." Linguistic Inquiry 16:547-593.

Higginbotham, James. 1989. "Elucidations of meaning." Linguistics and
Philosophy 12(3):465-517.

Higginbotham, James. 1990. "Reference and Control." ms., Department of
Linguistics, MIT.

Hopcroft, John and Jeffrey Ullman. 1979. Introduction to Automata Theory,
Languages, and Computation. Reading, MA: Addison-Wesley.

Jakobson, Roman and Morris Halle. 1956. Fundamentals of Language. The
Hague: Mouton.

Johnson, C. Douglas. 1972. Formal aspects of phonological description. The
Hague: Mouton.

Kayne, Richard. 1981. "Unambiguous paths." In Levels of syntactic
representation, R. May and J. Koster, eds. Dordrecht: Foris Publications,
pp. 143-183.

Kayne, Richard. 1984. Connectedness and binary branching. Dordrecht:
Foris Publications.

Keenan, Edward. 1971. "Names, quantifiers, and sloppy identity problem."
Papers in Linguistics 4(2):211-232.

Kenstowicz, Michael and Charles Kisseberth. 1979. Generative Phonology.
New York: Academic Press.

Kitagawa, Yoshi. 1986. Subjects in Japanese and English. Unpublished
Ph.D dissertation, Department of Linguistics, University of Massachusetts
at Amherst.

Kitagawa, Yoshi. 1989. "Copying identity." ms., Department of Linguistics,
University of Massachusetts at Amherst.

Koopman, H. and D. Sportiche. 1985. "Theta theory and extraction." in
GLOW newsletter 14:57-58.

Koopman, H. and D. Sportiche. 1986. "A note on long extraction in Vata
and the ECP." Natural Language and Linguistic Theory 4(3):357-374.


Koster, Jan. 1987. Domains and dynasties: the radical autonomy of syntax.
Dordrecht: Foris Publications.

Kuroda, S.-Y. 1988. "Whether we agree or not: a comparative syntax of
English and Japanese." Lingvisticae Investigationes 12:1-47.

Lambek, Joachim. 1961. "On the calculus of syntactic types." Proceedings
of Symposia in Applied Mathematics, volume XII, pp. 166-178.

Larson, Richard. 1988. "On the double object construction." Linguistic
Inquiry 19(3):335-391.

Lasnik, Howard. 1976. "Remarks on coreference." Linguistic Analysis
2(1):1-22.

Lasnik, Howard. 1981. "Treatment of disjoint reference." Journal of
Linguistic Research 1(4):46-58.

Lasnik, Howard. 1989. "On the necessity of binding conditions." in H.
Lasnik, Essays on anaphora. Dordrecht: Kluwer Academic.

Lasnik, Howard, and Mamoru Saito. 1984. "On the nature of proper
government." Linguistic Inquiry 15:235-289.

Lawler, E.L. 1976. "A note on the complexity of the chromatic number
problem." Information Processing Letters 5(3):66-67.

Levin, Beth and Malka Rappaport. 1989. "-er nominals: implications for
the theory of argument-structure." in E. Wehrli and T. Stowell, eds.,
Syntax and the lexicon. New York: Academic Press.

Levin, Leonid. 1973. "Universal sequential search problems." Problems in
Information Transmission 9:265-266.

Luria, A. R. 1968. The Mind of a Mnemonist: a little book about a vast
memory. New York: Basic Books.

Mahajan, Anoop Kumar. 1989. "Agreement and agreement phrases." MIT
Working Papers in Linguistics, vol.10, pp. 217-252.

Marantz, Alec. 1984. On the nature of grammatical relations. Cambridge:
MIT Press.

Marantz, Alec. 1989. "Projection vs. percolation in the syntax of synthetic
compounds." ms., Department of Linguistics, UNC Chapel Hill.

Marantz, Alec. 1989. "Asymmetries in double object constructions." talk
at MIT.

Marr, David. 1982. Vision. San Francisco: W.H. Freeman.




May, Robert. 1985. Logical form: its structure and derivation. Cambridge:
MIT Press.

McCarthy, John. 1981. "A prosodic theory of nonconcatenative
morphology." Linguistic Inquiry 12:373-418.

McCawley, James D. 1967. "Meaning and the description of languages."
appears in Grammar and meaning, Tokyo: Taishukan Publishing Co.,
1973.

Minsky, Marvin. 1969. Computation: finite and infinite machines.
Englewood Cliffs: Prentice Hall.

Partee, Barbara H. 1975. "Montague grammar and transformational
grammar." Linguistic Inquiry 6(2):203-300.

Pesetsky, David M. 1982. Paths and categories. Unpublished Ph.D
dissertation, MIT Department of Linguistics and Philosophy.

Peters, Stanley, and R.W. Ritchie. 1973. "On the generative power of
transformational grammars." Information Science 6:49-83.

Piera, Carlos. 1985. "Gaps in gaps in GPSG." Linguistic Inquiry
16:681-683.

Platek, Martin and Petr Sgall. 1978. "A scale of context sensitive languages:
applications to natural language." Information and Control 38(1):1-20.

Pollock, Jean-Yves. 1989. "Verb movement, universal grammar, and the
structure of IP." Linguistic Inquiry 20(3):365-424.

Pritchett, Bradley L. 1988. "Garden path phenomena and the grammatical
basis of language processing." Language 64(3):539-576.

Reif, John H. 1979. "Complexity of the mover's problem and
generalizations." Proceedings of the 20th Annual Symposium on Foundations
of Computer Science, 421-427. New York: IEEE Computer Society.

Reif, John H. and Micha Sharir. 1985. "Motion planning in the presence
of moving obstacles." Proceedings of the 26th Annual Symposium on
Foundations of Computer Science, 144-154. New York: IEEE Computer
Society.

Reinhart, Tanya. 1983. Anaphora and semantic interpretation. Chicago:
The University of Chicago Press.

Rissanen, Jorma. 1978. "Modeling by shortest data description."
Automatica 14:465-471.




Ristad, Eric. 1986. Complexity of Linguistic Models: a computational
analysis and reconstruction of generalized phrase structure grammar. S.M.
dissertation, MIT Artificial Intelligence Lab. (Portions published in
Barton, Berwick, and Ristad 1987.)

Ristad, Eric. 1986. "Computational complexity of current GPSG theory."
Proceedings of the 24th Annual Meeting of the ACL, New York, NY,
pp.30-39, June 10-13, 1986.

Ristad, Eric. 1988. "Complexity of Human Language Comprehension."
MIT Artificial Intelligence Lab, memo 964.

Ristad, Eric and Robert Berwick. 1989. "Computational consequences of
agreement and ambiguity in natural language." Journal of Mathematical
Psychology 33(4):379-396.

Ross, John Robert. 1967. Constraints on variables in syntax. Ph.D
dissertation, MIT Department of Linguistics and Philosophy. (Published in
1986 as Infinite Syntax!, Norwood NJ: Ablex.)

Rounds, William. 1975. "A grammatical characterization of the
exponential time languages." Proceedings of the 16th Annual Symposium on
Switching Theory and Automata. New York: IEEE Computer Society,
pp. 135-143.

Sag, Ivan Andrew. 1976. Deletion and Logical Form. Unpublished Ph.D
dissertation, MIT Department of Linguistics and Philosophy.

Sagey, Elizabeth. 1986. The Representation of Features and Relations in
Non-linear Phonology. Unpublished Ph.D dissertation, MIT Department
of Linguistics and Philosophy.

Sells, Peter. 1987. "Aspects of logophoricity." Linguistic Inquiry
18(3):445-479.

Sportiche, Dominique. 1986. "A theory of floating quantifiers and
consequences." Proceedings of NELS 17.

Sportiche, Dominique. 1988. "A theory of floating quantifiers and its
corollaries for constituent structure." Linguistic Inquiry 19(3):425-449.

Stabler, Edward P., Jr. 1983. "How are grammars represented?" The
Behavioral and Brain Sciences 6:391-421.

Stowell, Timothy. 1981. Origins of phrase structure. Unpublished Ph.D
dissertation, MIT Department of Linguistics and Philosophy.




Torrego, Esther. 1984. "On inversion in Spanish and some of its effects."
Linguistic Inquiry 15:103-129.

Turing, Alan. 1936. "On computable numbers with an application to the
Entscheidungsproblem." Proc. London Math. Soc. 42(2):230-265.

von Humboldt, Wilhelm. 1836. Linguistic Variability and Intellectual
Development. Translated by George C. Buck and Frithjof A. Raven, 1971,
Miami Linguistics Series No.9, Coral Gables, Florida: University of
Miami Press.

Wasow, Thomas. 1972. "Anaphoric relations in English." Ph.D
dissertation, MIT Department of Linguistics and Philosophy. (Published in
1979 as Anaphora in Generative Grammar, Ghent: E. Story-Scientia.)

Wax, Mati and Kailath, Thomas. 1985. "Decentralized processing in sensor
arrays." IEEE Trans. ASSP 33(1):1123-1129.

Wiesemann, Ursula, ed. 1986. Pronominal systems. Tubingen: Gunter
Narr Verlag.

Williams, Edwin. 1977. "Discourse and logical form." Linguistic Inquiry
8:101-139.

Williams, Edwin. 1978. "Across-the-board rule application." Linguistic
Inquiry 9(1):31-43.

Williams, Edwin. 1980. "Predication." Linguistic Inquiry 11(1):203-238.

Williams, Edwin. 1989. "The anaphoric nature of θ-roles." Linguistic
Inquiry 20(3):425-456.

Yao, Andrew. 1988. "Computational information theory." in Complexity
in Information Theory, Yaser Abu-Mostafa, ed., New York:
Springer-Verlag, pp. 1-15.




Appendix A 

Philosophical Issues 


In this chapter, we examine two philosophical issues arising from the research
described in the body of the report. First, we consider the implications
of an NP-completeness thesis for human knowledge. Next, we discuss the
idealizations to unbounded inputs and unbounded distinctions, which have
played an important role in the complexity analyses.


A.1 Implications of a complexity classification


What are the implications of placing human language in the abstract
hierarchy of computational complexity? And what is the exact significance
of proving NP lower bounds on language comprehension/production? The
consequences of our thesis, that language is NP-complete, are both practical
and theoretical.

The central theoretical consequence of the NP-completeness thesis is to
offer a broad new (and very different) perspective on human language, where
things previously obscured now become clear. By placing language in the
much-studied complexity hierarchy, we better understand its overall
computational structure, by analogy to the other equivalent combinatorial
problems in its complexity class. We see that language computations are not
like two-person adversary games (PSPACE), nor are they like pointer-following
(LSPACE) or directed search in a feasible space of possibilities (P). Rather,
human language is like blind search in an exponentially large space (NP),
to find efficient witnesses.




A second (indirect) theoretical consequence of the thesis has been to refine
linguistic theory. In order to carry out a direct complexity analysis,
we must present strong empirical arguments about the facts of linguistic
knowledge. Each direct analysis in this report has improved on current
understanding. In chapter 2, repeated complexity analyses led us to
reorganize the overgeneral architecture of segmental phonology, expose (for the
first time) the unnatural rule interactions allowed in segmental models, reveal
the importance that the methodological directive "omit directly-predictable
information" plays in phonological processes, and increase our
understanding of the SPE evaluation metric. In chapter 4, we falsified the
standard nondistinctness theory of anaphoric agreement and elucidated the
grammar of referential dependencies. Next we discovered the phenomena of
invisible obviation, demonstrated the necessity of revising the binding
conditions accordingly, falsified the widely-accepted copy-and-link theory of
syntactic ellipsis, and proposed an alternative predicate-sharing theory that
is strictly empirically superior (as shown in appendix B). Each discovery
arose naturally out of the complexity investigation.

We may also ask, what are the real-world implications of the NP lower
bounds, for natural language parsers and for theories of language processing?

A parser is an algorithm that assigns structural descriptions to linguistic
expressions, according to a linguistic theory that is typically represented as
a grammar. An expression is a string of abstract symbols, typically words
and occasionally morphemes. A parser is correct if it assigns to every input
expression exactly the structural descriptions that the generative
linguistic theory does. Given our current understanding of nondeterminism, the
NP-hardness of language comprehension means that correct natural
language parsers require an exponentially-increasing amount of time to parse
expressions of linearly-increasing length. In short, correct parsers must be
intractable. (This empirical consequence is nothing new; the intractability
of existing parsers is well-documented.)

A theory of language processing is an explicit computational model of the
language user, that attempts to explain (or at least describe) the
comprehension, production, and acquisition of languages. Sometimes such a
theory is called a performance theory, or a theory of sentence processing. The
complexity analyses in this report demonstrate conclusively that the relation
between competence and performance, between a generative theory and a
theory of processing, is not one of limited ability. This fact is contrary to
the frequently expressed and widely held beliefs of linguists, psycholinguists,
and computational linguists. If linguistic performance were the limited
ability to use linguistic competence, then two consequences would accrue. The
first is that language users would have difficulty processing those utterances
that are truly computationally difficult. The second is that language users
should not have difficulty processing computationally trivial utterances.

There are infinite classes of sentences that may be easily parsed (that is,
assigned correct structural descriptions by a simple and very fast algorithm),
yet these sentences are extremely difficult for language users to process.

One such class is those sentences with trivial obviation graphs, such as
complete or edge-free graphs. Computing the referential dependencies for
such utterances is trivial, yet language users cannot do it. (In fact, they
appear to have difficulty processing utterances with multiple antecedents,
regardless of the obviation relations involved.) A second instance is garden
path sentences, such as "the horse raced past the barn fell," which are quickly
parsed by simple algorithms, yet seem extremely difficult for language users
to process.

There are also infinite classes of simple sentences that cannot be efficiently
parsed by any known algorithm, yet these sentences are processed effortlessly
by human language users. One such class is sentences containing many local
ambiguities, such as lexically ambiguous words. Most sentences are in
this class, and no language user has any difficulty processing them.
However, such sentences bring current parsers to their knees, because it is not
known how to correctly resolve lexical ambiguities locally, without building
a complete structural description and therefore being forced to examine
an exponential number of possible parses. A second instance is utterances
understood as containing phonologically covert elements (so-called empty
categories, such as traces or PRO). Detecting empty categories and computing
their antecedents is extremely difficult for parsing algorithms, but
effortless for humans.
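
The growth in candidate analyses caused by lexical ambiguity can be made
concrete with a minimal sketch. The toy lexicon and the function names below
are assumptions introduced only for illustration; they are not taken from the
report or from any particular parser. The sketch simply counts the category
sequences that would have to be considered if no ambiguity could be resolved
locally.

```python
from itertools import product
from math import prod

# Hypothetical toy lexicon (an assumption, not from the report):
# each word maps to the syntactic categories it may bear.
TOY_LEXICON = {
    "the": ["Det"],
    "old": ["Adj", "Noun"],
    "man": ["Noun", "Verb"],
    "boats": ["Noun", "Verb"],
}


def count_assignments(words, lexicon=TOY_LEXICON):
    """Count the category sequences available when nothing is pruned locally."""
    return prod(len(lexicon[w]) for w in words)


def enumerate_assignments(words, lexicon=TOY_LEXICON):
    """List every category sequence; the list grows exponentially with length."""
    return list(product(*(lexicon[w] for w in words)))


if __name__ == "__main__":
    sentence = ["the", "old", "man", "the", "boats"]
    print(count_assignments(sentence), "candidate category sequences")
    # With two categories per word, the count doubles with every added word.
    for length in (5, 10, 20):
        print(length, count_assignments(["man"] * length))
```

Counting is cheap here, but a parser that must build a structural description
for each candidate before discarding it pays a cost proportional to the count.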

It is not at all surprising that attempts to explain so-called "performance
limitations" as resource-bounded competence have all failed. One fixed
resource bound is never subtle enough to capture the diverse range of observed
empirical facts. In order to have explanatory force, a small number of
resource bounds must be postulated to explain a large number of seemingly
unrelated performance limitations. (To postulate a different resource bound
for every construction is merely to restate the performance facts.) To my
knowledge, no one has been able to explain a truly diverse range of performance
facts (say from the phonology and syntax, or involving both referential
dependencies and phrase structure) using one resource bound, although many
have tried. Nor has anyone successfully described even a similar set of
performance facts using one resource bound. This may be seen in the work of
Miller and Chomsky (1963), who attempted to calculate a numerical bound
on the depth of acceptable recursive phrase structure embeddings. However,
their work only served to demonstrate that no fixed bound could be found
for the few constructions they examined, even in the limited domain of
phrase structure computations. A second example comes from the numerous
failed attempts to explain garden pathing as the inability to properly
resolve a local ambiguity in phrase structure attachment. The central
difficulty in such an endeavor is to explain why some types of local ambiguity
exhaust resources, while others don't, and why global ambiguities (which
should always be more costly) do not. It is not known how to resolve such
contradictions.

Even worse, an account in terms of resource limitations has never been
plausibly motivated, that is, shown remotely relevant to human language
processing. A theory of resource utilization makes exactly one fundamental
prediction: that the resource-consuming process must for some input at some
critical point exhaust the available resources, at which point the process
will crash. Those inputs that exceed the critical point will be rejected, even
though they are very similar to other inputs that do not exceed the critical
point. In order to demonstrate the plausibility of an explanation in terms
of resource limits, someone must exhibit examples on both sides of such a
critical point. This has yet to be done.

Nor can performance limitations be explained as errors in competence. The
language device cannot be said to make systematic or pervasive errors,
because such errors can exist only with respect to a designer's intentions or
goals, and the language device was not designed. In short, systematic
"errors" cannot be errors in performance, only empirical inadequacies of a
particular competence theory. A real performance error, then, must be
intermittent and unexpected. And if such errors are not to be accounted for
by the competence theory, as is widely assumed, then they cannot logically
constitute evidence for or against the competence theory. This is exactly
the a priori segregation of evidence into relevant/irrelevant that Chomsky
(1966) has so powerfully argued against. Empirical evidence for or against a
scientific theory might in principle be found anywhere. Linguistics is no
different; the one scientific theory of human language must explain performance
errors, because such errors are relevant evidence for the theory. Language
errors cannot have their own scientific theory. To see this, consider Becker
(1979), who shows how the independent tiers of the autosegmental model
can explain facts about speech errors such as that "when vowels or syllables
or parts of syllables or whole words are substituted or transposed, there is
no change in the stress contour of the sentences." (Fromkin, 1971:42)

What is the relation between a generative theory of linguistic knowledge and
a constructive theory of language that explains comprehension, production,
and acquisition? It seems to me that a constructive theory will result from
the generative theory under the information-theoretic interpretation
(outlined in chapter 1), refined by increasingly subtle principles and limited by
the current state of acquisition. Empirical facts thought of today as
performance limitations will be explained tomorrow either as interactions among
the refined linguistic principles, or as the incomplete acquisition of linguistic
knowledge.

Some so-called performance limitations will be understood as the interaction
of increasingly subtle linguistic principles. One such account is due
to Pritchett (1988), who explained garden path constructions in terms of
invariants in the computation of thematic structure. A second instance is
due to Idsardi (1989), who accounted for a range of classical performance
limitations (PP attachment ambiguities, garden paths, and multiple center
embeddings) in terms of a linguistic constraint on the mapping between a
syntactic relation (government) and a phonological structure (the intonation
phrase).

A second (more powerful) class of explanations may be obtained by thinking
of the generative theory as the theory of acquirable structural descriptions,
and performance limitations as temporary, accidental limits in the current
state of acquisition. On this view, language users learn how to pair
utterances with their permissible structural descriptions only after repeated
exposure to the relevant evidence. Let us consider some examples. The
naive language user does not easily recognize the ambiguities inherent in
many utterances, such as ambiguities in lexical choice or quantifier scope,
but once these ambiguities are pointed out and successfully acquired, they
are effortlessly detected and produced in novel utterances. Other examples
come from constructions on the frontiers of linguistics research, such
as parasitic gaps, strong crossover, and ellipsis. Language users have great
difficulty comprehending these constructions on their initial exposure. After
repeated exposure, however, these constructions are easily comprehended.
We would say that knowledge of the binding conditions is innate, but that
the language user must acquire anaphoric morphemes and learn how to compute
antecedence and obviation in particular structural configurations, such
as strong crossover or ellipsis. A third class of examples, such as center
embedding and garden paths, comes from psycholinguistics. Although it is
seldom discussed, the most striking fact about these constructions is that
after sufficient practice, the language user no longer has difficulty processing
them.

A.2 Unbounded idealizations 


A central assumption in this work has been the idealization to an unbounded
number of input instances, computational resources, and linguistic
distinctions. These "unbounded idealizations," from a finite set of finite objects
to an infinite set of finite objects, are as central to linguistics as they are
to computer science. Generative linguistic theory and theoretical computer
science make the same idealizations to unbounded inputs and computational
resources because they result in the best explanations and empirical
predictions.

The first idealization, from a necessarily finite set of inputs to an abstract
infinity of inputs, results in better linguistic theories. Finite sets may be
characterized by simply listing their elements or by bounding a finite
characterization of some infinite superset. The latter idealization to an infinite
set gives us a simpler, more predictive, and more interesting
characterization than any simple listing could. The idealization to unbounded inputs
gives us potent insights because it necessitates a finite characterization of an
infinite set, which is only possible if we have discovered significant structure
in that set.

The second idealization, from a class of computations that each uses a finite
amount of time and space to infinite computational resources, is central to
computer science: "To properly capture the notion of a computation we need
a potentially infinite memory, even though each computer installation is
finite." (Hopcroft and Ullman, 1979:14) In linguistics, Chomsky (1956) and
others have convincingly argued that human language is not a finite state
system, despite the empirical fact that language users have only finite
capabilities. Although every linguistic computation only uses a finite amount
of resources, viewing human language as a finite state system (as a
computation whose available resources are bounded by a fixed constant) does not
give us the most explanatory linguistic theory.

In general, we make idealizations to simplify the impossibly complex real
world, and therefore idealizations are never subject to direct empirical
confirmation. An idealization is justified only if it results in the best scientific
theory, with the best explanations and predictions, not if it is true or not.

Consider the Newtonian idealization to point masses. Clearly, it is
empirically false; there has never been a point mass, nor will there ever be.
However, this point-mass idealization is useful because it simplifies the
computation of interactions among massy objects without distorting the
outcome of that computation. However, when two objects are very close, or
when the computations are very sensitive, then the point-mass idealization
breaks down, and must therefore be abandoned. The sole justification of
an idealization is its utility: arguments about the a priori plausibility of an
idealization, although perhaps persuasive, are not ultimately relevant.

Unbounded idealizations are no different. In this finite world, there will
never be an infinity of anything. However, the idealization to an infinite
set of finite objects (an unbounded idealization, hereafter) can be an
extremely useful simplification of a finite set of finite objects whose size can
vary. An unbounded idealization is especially useful when the infinite set
is bounded by an order-of-growth function in some natural parameter. For
example, in order to restrict the amount of resources used by any given
computation while preserving the idealization to infinite resources, we can
bound resources by a function f(n) in the input length n. Thus, although
an f(n)-resource bounded computation in principle has access to an infinite
amount of computational resources, it may use no more than f(n) units of
a given resource on any actual length-n input. Crucially, in an unbounded
idealization, although objects can be arbitrarily large, each object is finite.
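
The notion of an f(n)-resource bounded computation can be sketched in a few
lines. The names run_with_budget, BudgetExceeded, and pairwise_steps are
hypothetical helpers invented for this illustration, and "steps" stands in
for whatever resource is being measured; this is a sketch under those
assumptions, not a definition drawn from the report.

```python
from itertools import combinations


class BudgetExceeded(Exception):
    """Raised when a computation needs more than f(n) units of the resource."""


def run_with_budget(steps, n, f):
    """Consume an iterable of steps, allowing at most f(n) of them.

    No fixed constant limits the computation: the budget grows with the
    input length n, but on any actual input it is a finite number.
    """
    budget = f(n)
    used = 0
    for _ in steps:
        used += 1
        if used > budget:
            raise BudgetExceeded(f"needed more than f({n}) = {budget} steps")
    return used


def pairwise_steps(items):
    """A toy computation (an assumption): one step per unordered pair."""
    return combinations(items, 2)


if __name__ == "__main__":
    items = list("abcde")
    # A quadratic bound accommodates the pairwise computation.
    print(run_with_budget(pairwise_steps(items), len(items), lambda n: n * n))
    # A linear bound does not.
    try:
        run_with_budget(pairwise_steps(items), len(items), lambda n: n)
    except BudgetExceeded as err:
        print("rejected:", err)
```

The point of the sketch is that the bound is a function of the input length
rather than a constant, so inputs of every finite size remain admissible.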

The idealization to an unbounded number of linguistic features is no
different from any other unbounded idealization. Features are a method of
representing significant distinctions, where each feature represents an
independent dimension wherein elements can differ. (The relevant parameter
is the number of significant distinctions, not the number of features.) The
unbounded-feature idealization does not claim that a language user is
capable of making an infinite number of linguistically-significant distinctions.
Rather, it claims that language users are best seen as being capable of
making any finite number of distinctions because the number of
empirically-observed distinctions is quite large and varies from language to language,
and even from language user to language user. In fact, linguistic features are
intuitively equivalent to computational space, and therefore the unbounded
feature idealization is properly included in linguistic theory's uncontroversial
idealization to infinite computational resources.

The goal of a complexity analysis is to characterize the amount of time and
space needed to solve a given problem in terms of all computationally
relevant inputs. Therefore, the unbounded-feature idealization is justified on
complexity-theoretic grounds if the number of linguistic features affects the
complexity of a linguistic process such as language comprehension. The
proofs in this report conclusively establish that the number of significant
distinctions is a significant parameter of the complexity of a linguistic process,
and therefore the idealization is justified in the framework of complexity
theory.

A central goal of linguistics is to characterize the productive portions of our
linguistic abilities. Therefore, the unbounded-feature idealization is justified
on linguistic grounds if the number of linguistically-relevant distinctions is
productive. A set of linguistic objects, such as the set of lexical entries, is
productive if the set is uniform, variable, and large. By uniform, I mean
that linguistic processes are not sensitive to the exact size of the set, nor is
each member of the set associated with its own idiosyncratic processes;
rather, linguistic processes apply uniformly to a significant subset of the set.
By variable, I mean that the number and type of linguistic objects varies
from theory to theory, language to language, and even speaker to speaker.
By large, I mean that the set of linguistic objects is not restricted to a
handful of such objects. If the set of linguistically-relevant distinctions is
uniform, variable, and large, then it is linguistically productive. This work
makes unbounded-distinction idealizations for two different classes of features:
syntactic features and phonological features. Let us consider each in turn.


A.2.1 Unbounded agreement features 

The set of syntactic distinctions is uniform; that is, syntactic features
are not associated with their own peculiar idiosyncratic agreement process,
fundamentally different from all other agreement processes. In the
linguistic theory of generalized phrase structure grammar, there are only three
(overlapping) classes of agreement features (HEAD, FOOT, and CONTROL),
to which agreement processes apply. In lexical-functional grammar and
related unification grammars, the sole agreement process (unification) applies
uniformly to any subset of features, and most commonly applies to all
features together (↑ = ↓). In the theories of Chomsky (1981; 1982; 1986),
agreement processes apply uniformly to the unbounded vector of
"φ-features."

The set of syntactic distinctions is also variable: different languages
employ different distinctions, and different theories often postulate wildly
different features. I am trying to get results that are invariant across a wide
range of linguistic theories. The significance, then, of the fact that the set of
syntactic distinctions varies from theory to theory is that this set will most
likely be explicitly variable in the 'true' linguistic theory. As mentioned in
chapter 4, pronouns in different languages are marked for a wide range of
distinctions, and those vary considerably from language to language. In
so-called nonconfigurational languages such as Latin, nouns express many more
overt case distinctions than in configurational languages such as English.
The number of agreement features specified on reflexives varies from
language to language: the Russian object reflexive sebja is featureless, whereas
Modern English reflexives are fully specified for the person, gender, and
number of their antecedent (Burzio 1988).

Finally, in syntactic theories that concern themselves with agreement
processes the number of distinctions induced by agreement features is certainly
large. For example, Finnish is known to have sixteen distinct cases, while
the Guinness Book of World Records states that Tabassaran has 35 different
cases, all subject to agreement constraints. In the Aspects transformational
grammar model, syntactic classes are characterized by at least ten binary
features (nominal, verbal, manner, definite, aux, tense, aspect, predicate,
adjective, predicate-nominal); prepositional phrases are characterized along
an unbounded number of syntactically-relevant dimensions ("Direction,
Duration, Place, Frequency, etc." p.107); nouns are distinguished by at least
ten syntactically-active features (common, abstract, animate, human, count,
det, gender, number, case, declension-class); and verbs are distinguished not
only by such features as object-deletion, transitive, and/or progressive, but
by their ability to distinguish all other syntactic categories in their selectional
restrictions. Government-binding theories of syntax are similarly capable of
enforcing agreement with respect to a large number of distinctions: for
example, selectional agreement occurs along such dimensions as theme (agent,
patient, goal, proposition, etc.) and case (nominative, accusative, objective,
oblique, genitive, ergative, etc.), in addition to all the distinctions of the
Aspects model. In generalized phrase structure grammar some agreement
features, such as PFORM and SUBCAT, are capable of making an unbounded
number of distinctions, and even if all GPSG features were restricted to
two values, GPSG agreement processes would still be sensitive to the more
than $10^{775}$ distinctions made by GPSG's complex feature system (Ristad
1986). Lexical-functional grammar has agreement processes sensitive to the
literally infinite number of distinctions that LFG's feature system is capable
of making (because syntactic categories in LFG may themselves contain an
arbitrary number of syntactic categories).

In short, linguistic support for the idealization to an unbounded number
of syntactic agreement features is quite significant. Now let us consider
whether the same is true for phonological features.

A.2.2 Unbounded phonological features

The set of phonological distinctions is uniform with respect to agreement
(and other phonological processes) because phonological agreement
processes such as assimilation and harmony apply to natural classes of
phonological features. That is, no feature has its own idiosyncratic phonological
agreement process; rather, one or two phonological agreement processes
apply to all natural classes of features, as determined by a language-universal
feature geometry (Sagey 1986).

The set of phonological distinctions is variable because the set of phonetic
segments (and articulatory features) varies from language to language, as do
all abstract phonological distinctions such as degrees of tone, vowel height,
sonority, and stress. The domain of assimilation processes also varies from
theory to theory, language to language, and even from speaker to speaker,
as do morpheme classes.

Finally, the number of phonologically significant distinctions is large. For
one, the human articulatory apparatus can produce a large number of
articulatorily-distinct and acoustically distinct segments. The transcription key
in Halle and Clements (1983) lists 52 consonants marked with 12 additional
distinctions and 21 vowels marked with 7 additional distinctions, for a total
of 771 (= 624+147) purely phonetic distinctions. Their system employs 21
distinctive features. Chomsky and Halle use 2£ distinctive features, and
the articulatory tree geometry of Sagey (1986) employs 'll nodes.
Phonological processes are additionally sensitive to the distinctions created by
the sonority hierarchy; syllable structure (onset, nucleus, coda, appendix,
branching/nonbranching structure, number of feet, syllabic weight, etc.);
tone (a range of discrete steps from highest to lowest in addition to rising
and falling tones, and tonal downsteps); stress (degree and type of foot);
and so forth. Morphological processes are sensitive to all those phonological
distinctions, plus a set of morpheme class distinctions that is itself uniform,
variable and large, and hence best seen as unbounded. For example, there
are upwards of twenty noun classes in the Bantu languages, and no reason
to believe "noun class 1" in a Bantu language is in any sense the same as
"noun class 1" in a Romance language.

The number of articulatory (phonetic) distinctions would seem to be bounded
by human physiology. But there is significant evidence that the bound is
not a constant, even with a fixed set of primary articulators. Many features
such as vowel height, tone, and sonority may be best seen as the arbitrary
discretization of an infinite continuum, a kind of scale. Some languages have
six degrees of vowel height, while others have only three, and certainly every
language can have its own sonority hierarchy and tonal inventory. Moreover,
there is no reason to believe that the language faculty is incapable of
using additional articulators, were they made available. For example,
speaking often is accompanied by the meaningful use of hand gestures and facial
expressions and the movement of secondary articulators such as the jaw.
Thus, although the number of muscles in our bodies is a (large) constant,
the language faculty does not appear to be tied to a fixed set of muscles
(witness sign languages) or muscular movements, and therefore the
idealization to an unbounded number of articulatory features may be empirically
correct, in addition to being theoretically justified on grounds of
productivity (being uniform, variable, and large). In fact, the language faculty
may be maximizing the number of perceptually observable distinctions in
the domain of a given sensory-motor system (Stevens and Keyser 1989).
Therefore, if the human motor system were capable of producing additional
perceptually distinct segments, the language faculty might employ them.

In conclusion, there is significant support for the idealization to an
unbounded number of linguistic distinctions in both phonology and syntax.
To assume otherwise is to vastly complicate linguistic theory. To argue
otherwise is to present a finite language-universal list of all possible
linguistically significant features, a project which has yet to begin and is
unlikely to finish in this century.


Limiting unbounded idealizations

It does, however, seem reasonable to limit the number of distinctions by
some sharp order-of-growth in a natural parameter. The natural parameter
for a language learner might be the amount of time spent acquiring the
language; in the case of a computational complexity analysis, the natural
parameter is the size of the input to the reduction. The polynomial time
bound on reductions limits us to specifying a feature system with no more
than a polynomial f(n) number of symbols, which can make at most $k^{f(n)/k}$
distinctions for k-ary features, which is maximal when the features are
binary. Some stricter limits include no more than an exponential number of
distinctions (linear number k · n of binary features) or a polynomial
number of distinctions (logarithmic number k · log n of binary features).
It is desirable to limit the number of distinctions available to a reduction
because this forces us to use other unbounded linguistically-significant
distinctions, based on other linguistic structures, in order to simulate
increasingly complex computations. In each proof, I explicitly state the number of
distinctions required for that proof to succeed.
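
The two stricter limits can be checked with a small worked sketch. It
assumes binary features, treats k as a constant coefficient, takes the
logarithm base 2, and restricts n to powers of two so that the arithmetic is
exact; the function names are hypothetical and serve only this illustration.

```python
def distinctions(num_binary_features):
    """m independent binary features distinguish 2**m combinations."""
    return 2 ** num_binary_features


def log2_exact(n):
    """Base-2 logarithm of n, defined here only for powers of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two in this sketch")
    return n.bit_length() - 1


def linear_limit(k, n):
    """k * n binary features: exponentially many distinctions, 2**(k*n)."""
    return distinctions(k * n)


def logarithmic_limit(k, n):
    """k * log2(n) binary features: polynomially many distinctions, n**k."""
    return distinctions(k * log2_exact(n))


if __name__ == "__main__":
    k = 2
    for n in (4, 8, 16):
        print(n, linear_limit(k, n), logarithmic_limit(k, n), n ** k)
```

For powers of two, the logarithmic limit yields exactly n**k distinctions,
which is why a logarithmic number of binary features corresponds to a
polynomial number of distinctions.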


Appendix B 

Structure of Elliptical 
Dependencies 


The goal of this appendix is to provide an analysis of referential dependencies
in elliptical contexts, such as VP-ellipsis (70a) and CP-ellipsis (70b), that
does not make use of a copy operation.


(70) a. Felix [hates his neighbors] and so does Max [e].

     b. Felix told Kyle [that he hates his neighbors] and Max told Lester
        [e].


I argue that the facts of elliptical dependencies can be accounted for by
two representational innovations. First, ellipsis is analyzed as
identically-composed thematic-structure shared between the overt structure and the
corresponding null structure. Second, the two relations of referential
dependency, link and obviate, are generalized to hold between positions in the
thematic-structure as well as positions in the phrase-structure.

Before proceeding, let us establish some terminology. We say two elements
in different structures correspond when they are in equivalent positions
and receive equivalent interpretations, assuming an appropriate notion
of structural equivalence. In example (70), Felix and Max correspond, as do
Kyle and Lester.

As we saw in chapter 4, the central theoretical problem posed by ellipsis
is that the invisible structure must be an independent copy of the overt
structure; yet at the same time, it cannot be.¹

At the beginning of section 4.2, I presented evidence that the overt structure
must be copied to the position of null structure in the syntax, that copying
is a recursive process, and that anaphoric elements may be linked to their
antecedents either before or after the copying (the copy-and-link theory 4.1).
Next, I falsified this theory by showing that the null structure is not an
independent copy of the overt structure, because (i) the original and its copy
do not obey the same post-copying linguistic constraints, and (ii) processes
that apply after copying, such as linking, do not apply independently in both
the original and its copy. To resolve this apparent paradox, that copying is
both necessary and impossible, I sketched an empirically superior
predicate-sharing theory 4.2 at the end of section 4.2.

In this appendix, I fill in the details of such a theory, and defend it. Briefly, I
propose that the invisible VP shares the thematic-structure of the overt VP,
but not its phrase-structure or phonology. I also generalize the two relations
of referential dependency, link and obviate, to hold between positions in the
thematic-structure or phrase-structure. Let α be an anaphoric element in
the overt VP, β an argument of the head of that VP, and β′ the corresponding
argument of the invisible VP. The invariant interpretation of α arises when
α links to the phrase-structure position of β. The covariant interpretation
arises when α links to the thematic-position assigned to β, because then α
also links to the thematic-position assigned to β′.
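
The contrast between linking to a phrase-structure position and linking to
a thematic-position can be illustrated with a deliberately small sketch
based on example (70a). The data layout, the role label "external", and the
function resolve are assumptions made for this illustration; they are not
the system of representation proposed in this appendix.

```python
def resolve(link, fillers):
    """Return the antecedent that a link picks out within one clause.

    link    -- ("phrase", np) for a link to a phrase-structure position, or
               ("thematic", role) for a link to a thematic-position
    fillers -- mapping from thematic roles to the arguments filling them
    """
    kind, target = link
    if kind == "phrase":
        # A phrase-structure position belongs to the overt structure only,
        # so the antecedent stays the same in every clause.
        return target
    if kind == "thematic":
        # A thematic-position is shared by the overt and invisible VPs,
        # so the antecedent is whichever argument fills it in this clause.
        return fillers[target]
    raise ValueError(f"unknown link kind: {kind}")


# "Felix [hates his neighbors] and so does Max [e]."
OVERT_CLAUSE = {"external": "Felix"}
NULL_CLAUSE = {"external": "Max"}

INVARIANT_LINK = ("phrase", "Felix")       # "his" linked to the NP Felix
COVARIANT_LINK = ("thematic", "external")  # "his" linked to the shared role

if __name__ == "__main__":
    for name, link in (("invariant", INVARIANT_LINK), ("covariant", COVARIANT_LINK)):
        print(name, resolve(link, OVERT_CLAUSE), resolve(link, NULL_CLAUSE))
```

Under these assumptions, the same link yields Felix in both clauses on the
invariant interpretation, and Felix then Max on the covariant one, without
any copy of the overt VP being made.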

Now let α locally obviate the thematic-position assigned to β, according to
binding condition B. Then α also locally obviates the corresponding
argument β′ because the same thematic-position is assigned to both β and β′.
The cases of invisible obviation arise when α is linked to some antecedent
γ and β′ is coreferential with γ. Then α is both obviative and coreferential
with β′, which is a contradiction. The details of this analysis may be found
below in section B.4.3.

It is of course possible to develop an alternate analysis that does not refer
to a level of thematic-structure, and does not generalize linking and
obviation to thematic-positions. Such an analysis is sketched in section B.3; it is
considerably less elegant. My central motivation in this research, however,
is to accumulate evidence for the constructive complexity thesis for human
language. In the introduction to the report, I argue that human language
has the structure of an NP-complete problem. That is, the process of
constructing linguistic representations is bounded above by NP and below by
NP-hardness. As proved in chapter 4, the copy-and-link analysis of ellipsis
leads to a complexity outside of NP (in fact, to PSPACE-hardness). By
eliminating the recursive copy operation from linguistic theory, we provably
reduce the complexity of representing ellipsis from PSPACE-hardness to
inside NP. The fact that such a reduction in complexity is possible constitutes
empirical evidence for the NP upper-bound.

¹ For historical reasons, the ellipsis phenomenon has been called "deletion."
Essentially, an underlyingly "nondistinct" subconstituent in a structural
description could be deleted in the D-structure to S-structure derivation
(Ross, Keenan 1971), in the logical form to S-structure derivation
(McCawley 1967), or in the S-structure to PF derivation (Sag 1976).

The remainder of this appendix is organized as follows. In the next section,
section B.1, previous work is reviewed in an attempt to illuminate the inherent
structure of an adequate account of elliptical dependencies. We begin with
the earliest theories of VPE, confront these theories with empirical
difficulties, and in this manner move to successively more sophisticated theories.
Section B.2 presents the phenomenon of invisible obviation in detail.
Section B.3 discusses the necessary structure of an adequate theory of ellipsis.
Section B.4 proposes an explicit system of representation as it applies to
VPE, and section B.5 illustrates it for the tricky cases of invisible obviation,
invisible crossover, recursive ellipsis, and nonsubject covariance.

B.1 Previous work reconsidered

It is the central insight of early work on VP-ellipsis that both overt and
null VPs correspond to identical underlying predicates: either the pronoun
his inside the identical predicates refers to a constant (either Or Felix
in example (G3)), resulting in the invariant interpretation, or it refers to an
argument of the predicate (in this case, the external argument), resulting
in the covariant interpretation. In this chapter, we accumulate evidence
for a refinement of this view.² If this is so, then the central research
questions are how to represent the predicates, and what constitutes identity of
predication?

² The "identity of predication" observation has been made in one form or
another, apparently independently, by a wide range of authors including
McCawley (1967), Keenan (1971), Lasnik (1976), Sag (1976), Williams (1977),
and Reinhart (1983). Keenan's work is particularly valuable for the simplicity
of its presentation, and Sag's for the breadth of its empirical investigation.


B.1.1 Covariance reduced to predication of subjects

The guiding idea of both Sag (1976) and Williams (1977) is to reduce
anaphoric covariance to the predication of subjects. This idea may be
described informally as follows. At some level of representation after
S-structure (either at LF or in the discourse grammar) a VP is represented
as a one-place predicate, where the external argument of the VP is bound
by a λ-variable inside the VP, as in (λx.[x eat dinner]). Exercising
some amount of charitable reconstruction, we may say that pronouns are
assigned free "referential" indices at D-structure. The grammar contains
a Pronoun Rule that optionally replaces an anaphoric element coindexed
with the subject with a variable bound by the λ-operator, as in (λx.[x eat
x's dinner]). At some subsequent level of representation, the λ-expression
corresponding to the overt VP is copied to the position of the null VP. The
covariant interpretation is obtained when the Pronoun Rule applies, the
invariant interpretation when it does not.
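
The copy-based derivation just described can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the
formalism of Sag or Williams: the names make_vp and ellipsis, the
pronoun_rule flag, and the use of Python closures to stand in for
λ-expressions are all introduced here for exposition.

```python
# Hypothetical sketch: a VP is modelled as a one-place predicate,
# i.e. a Python function of its external argument x.

def make_vp(verb, obj, possessor=None, pronoun_rule=False):
    """Build the lambda-expression for a VP such as "hates his neighbors".

    possessor:    referent of the pronoun when the Pronoun Rule does not apply
                  (the pronoun keeps its constant referential index).
    pronoun_rule: if True, the pronoun is replaced by the bound variable x.
    """
    def predicate(x):
        owner = x if pronoun_rule else possessor
        return f"{x} {verb} {owner}'s {obj}"
    return predicate


def ellipsis(overt_subject, null_subject, vp):
    """Copy the overt VP's lambda-expression to the position of the null VP."""
    return vp(overt_subject), vp(null_subject)


# Pronoun Rule applies: covariant interpretation.
covariant = ellipsis("Felix", "Max",
                     make_vp("hates", "neighbors", pronoun_rule=True))

# Pronoun Rule does not apply: invariant interpretation.
invariant = ellipsis("Felix", "Max",
                     make_vp("hates", "neighbors", possessor="Felix"))
```

In this sketch the only difference between the two readings is whether the
possessor is fixed before copying or is bound by the predicate's own
variable, which is the distinction the Pronoun Rule is meant to capture.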

Although the particular mechanism of λ-abstraction is not a natural
component of the current principles-and-parameters framework, this idea may
be easily implemented in a number of other ways using mechanisms that have
been independently motivated. To illustrate the central issues, we consider
two mechanisms: predicate structure and VP-internal subject. In either
case, the LF representation of VP-ellipsis is interpreted (after LF) as if the
VP predicate appeared in the position of both overt and null VPs.

First, we may appeal to a suitably modified version of Williams' (1980)
predicate structure, where the subject-predicate relation is represented by
coindexing each predicate and its subject. The covariant interpretation,
where the anaphoric element α refers to the argument of the predicate, is
represented by assigning the same variable index to the predicate and its
embedded, referentially-dependent element α, as in John_1 [ate his_i dinner]_i.
The invariant interpretation, where α refers to the matrix subject, is
represented by assigning the same constant index to α and the matrix subject,
as in John_1 [ate his_1 dinner]_i. (Here we temporarily depart from our
convention of using subscripts to represent speaker-hearer judgements.)

Second, highly articulated phrase structure can give rise to a linking
ambiguity. For example, we might postulate a VP-internal subject position,
following Fukui (1986), Kitagawa (1986), Koopman and Sportiche (1985; 1986),
and Kuroda (1985b). Now every embedded pronoun coreferential with the
clausal subject may be ambiguously linked directly to the subject, or to the
empty position in the VP that is itself linked to the subject. In the latter
case, we may obtain the covariant interpretation; in the former case, we
obtain the invariant interpretation when the subject is a logical constant,
such as a proper noun or definite NP. A typical surface form is:

(71) [S [NP Felix]_1 [VP [NP e]_1 [V1 hates [NP his neighbors]]]]
     and [so does [Max [VP e]]]

When the subject of the first conjunct is a logical constant such as the
proper noun Felix, then the embedded pronoun his can be linked either
directly to that logical constant to obtain the invariant reading, or to the
VP-internal specifier position [NP e] (which is itself a logical variable
linked to the subject) to obtain the covariant reading.


B.1.2 The problem of nonsubject covariance

The central prediction of any such theory is that the covariant interpretation
is only available for anaphoric elements coreferential with the subject of a
predicate that contains them. This prediction appears to be false. As
Reinhart (1983:152) observes, covariant readings are available when the
relevant ellipsis is not VP and when the antecedent is not a subject:

(72) a. We paid [the professor]_1 his_1 expenses, but not [the student]_2.
        ([we didn't pay [the student]_2 his_1/2 expenses])

     b. The nurse referred Siegfried_1 to his_1 doctor, and Felix_2 too.
        ([the nurse referred Felix_2 to his_1/2 doctor])

     c. You can keep Rosa_1 in her_1 room for the whole afternoon, and
        Zelda_2 too.
        ([you can keep Zelda_2 in her_1/2 room ...])

The simplest solution to this difficulty is to assign a new phrase structure
to these constructions, where what were objects become subjects. Then
the Sag-Williams analysis, which reduces covariance to the predication of
subjects, would still apply, mutatis mutandis. One such approach, due to
Kayne (1981; 1984), analyzes the double objects of the verbs in (72) as small
clauses, as in (73).




(73) We believe [[SC John [V [a genius]]] and [SC Bill [e]]] too.
     ([a genius])

Kayne suggests that small clauses of the form [SC NP NP] contain an embedded
abstract verb-like element V that expresses the thematic relationship
between the two objects. (In the case of (73), Kayne would postulate an
underlying abstract 'be' element, i.e., [believe [NP [is NP]]]; in (72a), Kayne
would postulate an abstract 'have' element, i.e., [pay [NP [has NP]]].) Then
Reinhart's examples (72) would be assigned the phrase structures shown in
(74):

(74) a. We paid [[SC [the professor]_i [V his_i expenses]] but not [SC the
        student [e]]].

     b. The nurse referred [[SC Siegfried_i [V to his_i doctor]] and [SC
        Felix [e]] too].

     c. You can keep [[SC Rosa_i [V in her_i room for the whole
        afternoon]] and [SC Zelda [e]] too].

A second approach, due to Larson (1988), would assign the highly articulated
phrase structures in (75) to Reinhart's examples from (72):

(75) a. We paid_i [[VP [the professor]_j [V1 [t_i t_j] [his expenses]]] but not
        [VP [the student]_k [e]]].

     b. The nurse referred_i [[VP Siegfried [V1 t_i [to his doctor]]] and [VP
        Felix [e]] too].

     c. You can keep_i [[VP Rosa [V1 t_i [VP [in her room] [V1 t_i [for the whole
        afternoon]]]]] and [VP Zelda [e]] too].

These novel phrase structures also hold the promise of assisting our analysis
of covariant interpretations in CP-ellipsis, as in (76). (Keenan (1971)
analyzes these constructions, which he calls deletion, and demonstrates that
the covariant coreference relations in the elliptical clause may be arbitrarily
complex.)

(76) John_1 told Betty_2 [that he_1 thought she_2 was drunk] and
     Orville_3 told Naomi_4 [e] (too)
     ([that he_1/3 thought she_2/4 was drunk])




As before, both the agent (subject) and benefactor (object) of tell would be
underlyingly subjects of predicates, and hence both would be available as
the antecedent of a covariant interpretation.

Observe that CP-ellipsis (77a) and PP-ellipsis (77b) must be distinguished
from the corresponding NP-ellipsis, as in (78).

(77) a. Sally told John [CP that cookies had been baked] and Deb told
        Andy [e] too.

     b. Sally told John [PP about the fresh-baked cookies] and Deb told
        Andy [e] too.

(78) a. *? Sally told John [NP an interesting story] and Deb told Andy
        [e] too.

     b. * Sally gave John [NP fresh-baked cookies] and Deb gave Andy
        [e] too.

This distinction appears to be related to underlying differences in thematic
structure. In particular, the tell of (77) is like inform in that it permits an
optional theme argument, as in Sally informed John, whereas the give and
tell of (78) are like relate in that both require an obligatory theme argument,
as in * Sally related/gave John. Thus, it seems that ellipsed arguments do
not satisfy obligatory selectional constraints.

These facts present a serious difficulty for an approach that attempts to
reduce either nonsubject covariance or CP-ellipsis to V1-ellipsis under the
Kayne/Larson analysis of double object constructions. In such an approach,
the benefactive antecedent β is in the specifier-of-VP position, and the
ellipsed structure is a V1. So a rule of V1-ellipsis is needed. However, the
unacceptable example (80a) would then be assigned the permissible structure
in (80b or c), showing that not all V1 constituents may be the target
of ellipsis.³

3 A further technical difficulty, particular to Larson's analysis of double
object constructions, occurs in cases of Heavy NP Shift, which Larson
analyzes as V1-to-V0 reanalysis followed by Light Predicate Raising. In these
cases, the combined analysis incorrectly predicts that the covariant
interpretation is not available when the indirect object is the antecedent
(79a), simply because the indirect object has been incorporated into the V0
and is no longer a subject, as shown in (79b):





(80) a. * Sally gave John fresh-baked cookies and Deb gave Andy [e]

     b. [Sally gave [John [V1 V fresh-baked cookies]]] and [Deb gave
        [Andy [V1 e]]] too.

     c. Sally gave_i [[VP John_j [V1 [t_i t_j] [fresh-baked cookies]]]] and
        [Deb gave_k [[VP Andy [V1 e]]]] too.


For this reason, nonsubject covariance and CP-ellipsis remain open problems
in this approach.


B.1.3 Covariance reduced to bound anaphora

Reinhart's solution to the difficulties posed by nonsubject covariance and
CP-ellipsis is to reduce the covariant interpretation of anaphora to the
bound-variable interpretation of anaphora. This solution, following an
earlier suggestion due to Lasnik (1976:20), is based on the observation that
the covariant interpretation of an anaphoric element α coreferential with an
argument β is available if and only if α can be interpreted as a bound
variable in the scope of a QNP in the position of β.

(81) a. We paid [every man]_i his_i expenses.

     b. The nurse referred [every victim]_i to his_i doctor.

     c. You can keep [some woman]_i in her_i room for the whole afternoon.

Reinhart crucially distinguishes bound anaphora from pragmatic or accidental
coreference. Accidental coreference is an extra-syntactic relation between
two NPs, either of which may or may not be referentially dependent. Bound
anaphora is a syntactic relation between an NP β and an anaphoric element
α that is understood as a variable bound by β. It is represented by
coindexing α and β, subject to the following conditions: (i) β c-commands α; and

(79) a. We paid to [the professor]_1 the most outlandish expenses that he_1
        had ever incurred, and to [his student]_2 too
        ([we paid to [his student]_2 the most outlandish expenses that
        he_1/2 had ever incurred])

     b. We [V0 paid to the professor]_i [VP [the most outlandish expenses
        that he had ever incurred] t_i]

Moreover, the pronoun is no longer c-commanded by its antecedent, which will
also incorrectly block the desired covariant interpretation for these
structures.



(ii) if α is a pronoun, then β cannot be dominated by the minimal governing
category mgc(α) dominating α; (iii) otherwise α is a reciprocal or reflexive,
and then β must be dominated by mgc(α). The semantic interpretation of
a CP containing β is given by the rule (82), which λ-abstracts β from φ
and replaces all α coindexed with (and hence c-commanded by) β by the
λ-variable that replaced β.

(82) [CP φ] ⇒ [CP β (λx.[φ β/x])]

Example (81a) is assigned the surface structure (83a) by the coindexing rule
and the semantic interpretation (83b) by the rule (82).

(83) a. We [VP paid [every man]_i [his_i expenses]].
     b. ([Every man] (λx.[we paid x x's expenses])).

It seems that this system is meant to apply to VPE as follows. The CP that
contains the overt VP is interpreted by λ-abstracting some of its arguments
to form a λ-expression E; the CP that contains the null VP is interpreted
by λ-abstracting its overt arguments, and then applying them to E. The
covariant interpretation would arise when a λ-abstracted argument β has been
coindexed with anaphoric elements in the syntax, as in (84a); the invariant
interpretation when β is "accidentally" coreferential with an anaphoric
element, as in (84b):


(84) a. ([The professor] (λx.[we paid x [x's expenses]]))
        but not ([The student] (λx.[we paid x [x's expenses]]))

     b. ([The professor]_1 (λx.[we paid x [his_1 expenses]]))
        but not ([The student] (λx.[we paid x [his_1 expenses]]))
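
The contrast in (84) can be made concrete with a small Python sketch of
rule (82) operating on indexed token lists. This is an illustration under
simplifying assumptions, not Reinhart's system: the function name
lambda_abstract, the (word, index) encoding of a clause, and the treatment
of possessive pronouns as "x's" are all introduced here.

```python
# Hypothetical sketch of rule (82): abstract over the tokens coindexed with beta.

def lambda_abstract(clause, index):
    """Return a one-place predicate built from an indexed clause.

    clause: list of (word, index) pairs; index is None for unindexed words.
    Every token carrying `index` is replaced by the bound variable x;
    a coindexed possessive pronoun surfaces as "x's".
    """
    def body(x):
        out = []
        for word, idx in clause:
            if idx == index:
                out.append(f"{x}'s" if word in ("his", "her") else x)
            else:
                out.append(word)
        return " ".join(out)
    return body


# (84a): the pronoun is coindexed with beta in the syntax (bound anaphora).
covariant_clause = [("we", None), ("paid", None), ("the professor", 1),
                    ("his", 1), ("expenses", None)]

# (84b): the pronoun is only "accidentally" coreferential, so it is not
# coindexed and survives abstraction as a constant.
invariant_clause = [("we", None), ("paid", None), ("the professor", 1),
                    ("his", None), ("expenses", None)]

E_cov = lambda_abstract(covariant_clause, 1)
E_inv = lambda_abstract(invariant_clause, 1)
```

Applying E_cov to "the student" yields a clause in which the possessor
covaries with the new argument, whereas E_inv leaves "his" untouched, to be
understood as still referring to the professor.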

This proposal is missing many crucial details; Reinhart does not supply
them. Perhaps they can be supplied. There is also the question of the
adequacy of the proposed theory of anaphora.⁴ Nonetheless let us assume
that it is correct in order to evaluate this approach to covariance.⁵

4 For one, I am not convinced that the claimed disjoint-reference
consequences follow from the proposed pragmatic Gricean theory based on
speaker/hearer intentions. Lasnik (1989) discusses these cases and other
empirical failings; do not overlook his fn. 1.

5 Even assuming that the proposed technical system (bound anaphora only if
c-command holds) covers the central cases, Reinhart (chapter 8) and others
have observed





It is not clear what conditions permit an argument in the elliptical
structure to be applied to the λ-abstracted predicate constructed from the
overt clause. For example, Reinhart's system fails to explain the existence
of an independent tense marker in the elliptical clause. That is, why should
the subject of the elliptical VP require an inflected agentive do, and why can
this inflection differ from the corresponding inflection in the overt VP?

(87) a. Felix hated his neighbors, but Max (still) does.
     b. Felix hates his neighbors, and Max did too.

Example (88) demonstrates that this is not a question of the case filter
applied to the subject of the elliptical clause:

(88) Felix_1 may hate his neighbors, but not Max_2.
     (i.e., Max_2 doesn't hate his_1/2 neighbors)

Recall also the cases of nonsubject covariance in (72), where the overt
antecedent of the covariant pronoun is entirely by itself in the elliptical
clause.

that bound anaphora are available even when c-command does not obtain. In
addition, Reinhart's correlation (covariant interpretation if and only if
bound anaphora interpretation) has exceptions. For example, Reinhart
predicts that (85b) has a covariant interpretation precisely because (85a)
has a bound variable interpretation:

(85) a. Zelda thought about [every man]_i on his_i wedding day.

     b. Zelda thought about [every man]_i on his_i wedding day and about
        Felix_2 too
        ([thought about Felix_2 on his_2 wedding day])

However, when a proper noun replaces the QNP antecedent in this example, the
covariant interpretation is crucially not available for some speakers:

(86) * Zelda thought about Siegfried_1 on his_1 wedding day and about
       Felix_2 too
       ([thought about Felix_2 on his_2 wedding day])

(In fact, my informants claim that (86) can only mean that 'Zelda thought
about Felix too.') This contrast presents difficulties for Reinhart's
correlation, as well as for other theories. The natural solution is to appeal
to the distinction between quantifiers and proper nouns, i.e., the former are
assigned scope while the latter are not. There are a number of ways to
implement this proposal; for example, perhaps anaphoric dependencies
(theta-links) in these cases must be established under c-command between the
scope marker of the quantifier and the anaphor qua variable. I return to
these difficulties below in section B.5.





A central property of Reinhart's system, and of the other systems we have
considered, is a fundamental asymmetry between overt and null structures.
Relations of anaphoric antecedence are established in the overt structure,
subject to syntactic constraints obtaining in that structure, and then
applied to the argument(s) of the null structure. So, if an anaphoric element
α links to an argument β in the overt structure, then the copy of α in the
null structure will also be allowed to link to the argument corresponding
to β.


B.2 Invisible obviation 

Now consider the discourse (59), and its variant (3"), repeated here as
(89) and (90).

(89) Ann: Romeo_1 wants Rosaline_2 to [love him_1].
     Ben: Not any more -- now Romeo_1 wants Juliet_3 to [e].
          ([love him_1])

(90) Ann: Romeo_1 wants Rosaline_2 to [love him_i]. (i = 1)
     Ben: Not any more -- now Rosaline_2 wants Romeo_1 to [e].
          ([love him_i], i ≠ 1)

In both examples, Ann's use of the pronoun him is most naturally understood
as referring to Romeo. Yet when Ben replies in example (90), the
coreferential interpretation (i = 1) is no longer possible in Ann's statement.
This "invisible" relation of local obviation can also be created entirely
within a sentence, with the pronoun understood as first including but later
obviating the argument Romeo:

(91) Romeo_1 wanted Rosaline_2 to [love him_*1] before wanting himself_1
     to [e].

Similarly, an R-expression is "invisibly obviative" from its local
c-commanders, as in (92), where pragmatic considerations strongly favor a
coreferential interpretation that can only be excluded by syntactic
principles.

(92) Sue likes Narcissus_1 and he_*1 does [e] too.




There are a number of subtleties, however, the most interesting of which
is that invisible obviation is entirely a local phenomenon, as illustrated in
(93).


(93) a. He_*1 knew Juliet loved Romeo_1.

     b. The nurse [knew Juliet loved Romeo_1] before he_1 did [e].
        ([knew Juliet loved Romeo_1])


Although the pronoun he must be obviative from the R-expression Romeo
that it overtly c-commands in (93a), it need not be obviative from the
R-expression that it invisibly c-commands in (93b).

In fact, the domain of invisible obviation is exactly the local domain of
binding condition B. Let position i c-command position j in a phrase marker.
Then invisible obviation holds between positions i and j if a pronoun in
position j would be obviative from an argument in position i (unless, of
course, there is an anaphor in position j).
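
The condition just stated can be sketched as a small predicate over
positions in a phrase marker. The sketch below makes strong simplifying
assumptions that are mine, not the report's: positions are encoded by their
tree path, every non-terminal is assumed to branch, and the local (condition
B) domain of each position is supplied by hand in a `domain` field rather
than computed from government. The names Position, c_commands, and
invisibly_obviative are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Position:
    label: str
    kind: str      # "pronoun", "r-expression", or "anaphor"
    path: tuple    # child indices from the root of the phrase marker
    domain: str    # assumed, hand-assigned id of the local (condition B) domain


def c_commands(a, b):
    """a c-commands b if a's mother dominates b and a does not dominate b.

    Assumes every non-terminal in the phrase marker branches.
    """
    mother = a.path[:-1]
    return b.path[:len(mother)] == mother and b.path[:len(a.path)] != a.path


def invisibly_obviative(i, j):
    """Position i c-commands j inside j's local domain, and j is not an anaphor."""
    return c_commands(i, j) and i.domain == j.domain and j.kind != "anaphor"


# Null structure of (92): he [likes Narcissus]
he_92 = Position("he", "pronoun", (0,), "IP1")
narcissus = Position("Narcissus", "r-expression", (1, 1), "IP1")

# Null structure of (93b): he [knew [Juliet loved Romeo]]
he_93 = Position("he", "pronoun", (0,), "IP1")
romeo = Position("Romeo", "r-expression", (1, 1, 1, 1), "IP2")
```

With these hand-built positions, the predicate holds of the local pair in
(92) but not of the nonlocal pair in (93b), matching the judgements reported
above.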

This Invisible Obviation Condition (IOC), a descriptive generalization that
follows from deeper principles discussed below, is illustrated by the
following examples, for both pronouns (96b, 97b) and R-expressions (96c, 97c)
c-commanded in position j:⁶,⁷

6 It appears that both overt and invisible condition C effects between two
R-expressions can be overcome with heavy phonological stress, as in (94),

(94) a. BILL_1 wanted BILL_1 to kiss Mary.

     b. Sue [wanted BILL_1 to kiss Mary] and BILL_1 did [e] too.

whereas invisible condition C effects between an R-expression and a
c-commanding pronoun are inviolable, regardless of the amount of stress (95).

(95) a. * He/HE_1 wanted Bill/BILL_1 to kiss Mary.

     b. * Sue [wanted Bill/BILL_1 to kiss Mary] and he/HE_1 did [e] too.

7 The examples in (96b) and (97b) are constructed using a feminine antecedent
Sue to more clearly reveal the invisible obviation configuration. However,
the IOC appears to overlap in these examples with an independent (not
understood) constraint that excludes some cross-conjunct antecedences, e.g.,
* Sue wanted him_1 to win and Tom_1 wanted himself_1 to win (too).





(96) a. Bill_1 wanted him_*1 to kiss Mary.

     b. Sue [wanted him_*1 to kiss Mary] and Bill_1 did [e] too.

     c. Sue [wanted Bill_1 to kiss Mary] and he_*1 did [e] too.

(97) a. Bill_1 wants PRO_1 to love him_*1.

     b. Sue wants Mary to [love him_*1] and Bill_1 wants PRO_1 to [e]
        (too).

     c. Sue wants him_1 to [love Mary] and Bill_1 wants PRO_1 to [e] (too).

Examples (98) demonstrate that the nonlocal obviation defined by binding
condition C is not relevant to the IOC.

(98) a. Bill_1 wanted Mary to kiss him_1.

     b. He_*1 wanted Mary to kiss Bill_1.

     c. Sue [wanted Mary to kiss him_1] and Bill_1 did [e] too.

     d. Sue [wanted Mary to kiss Bill_1] and he_1 did [e] too.

The fact that the IOC should be defined relative to condition B and not
in terms of (the negation of) condition A is illustrated with a prepositional
adjunct in (99), and with a possessive in (100).

(99) a. Bill_1 saw a snake near him_1/himself_1.
     b. He_*1 saw a snake near Bill_1.
     c. Tom [saw a snake near Bill_1] before he_1 did [e].

(100) a. Bill_1 knew that pictures of him_1/himself_1 would be on sale.
      b. He_*1 knew that pictures of Bill_1 would be on sale.
      c. Sue [knew that pictures of Bill_1 would be on sale] before he_1 did
         [e].

As noted in the introduction, the invisible structure is not an invisible
pronoun, simply because there is no invisible obviation when an overt pronoun
is used (39b, 102b) instead of ellipsis (39a, 102a):⁸


8 The effects of invisible obviation are pronounced when the invisible pronoun
contains an anaphoric element that must have a local antecedent, such as an
anaphor.



(102) a. Juliet_1 thought that the friar_2 [poisoned her_*1] without realizing
         she_1 did [e].

      b. Juliet_1 thought the friar_2 [poisoned her_1]_3 without realizing that
         she_1 did it_3.

If the null structure were simply an empty pronoun at LF, then there would
be no way to explain the lack of invisible obviation in (39b, 102b).

The following examples are particularly interesting because they demonstrate
that local obviation in the overt structure is preserved in the null
structure, even when it is embedded one level, as in (103), or more than one
level, as in (104).


(103) a. Bill_i [wants PRO_i to love him_*i].

      b. Sue_j [wants PRO_j to love Bill_i] and he_*i does [e] too.

      c. Sue_j [wants PRO_j to love him_*i] and Bill_i does [e] too.

(104) a. Bill_i [expects PRO_i to want PRO_i to love him_*i].

      b. Sue_j [expects PRO_j to want PRO_j to love Bill_i] and he_*i does
         [e] too.

      c. Sue_j [expects PRO_j to want PRO_j to love him_*i] and Bill_i does
         [e] too.

Contrast these examples to the examples (105), which show that the nonlocal
obviation of condition C is not similarly preserved under embedding.

The only configurations with this property require an elliptical infinitival
VP, where the subject of the null infinitival VP is an anaphor as in (101a).
In such a configuration, however, it is not possible to directly
pronominalize the overt VP (101b), perhaps for reasons having to do with the
case filter. Instead, we must introduce the agentive do as in (101c).

(101) a. Romeo_1 asked the apothecary_2 to [kill him_*1/*2] before telling
         himself_1 to [e].

      b. * Romeo asked the apothecary to [kill him] before telling himself
         to it.

      c. Romeo_1 asked the apothecary_2 to [kill him_1/*2] before telling
         himself_1 to do it_3.

These examples raise doubts as to whether we can adequately view both do and
to as realizations of Infl.



(105) b. He_i [wants PRO_i to know if Mary loves Bill_1]. (i ≠ 1)

      c. Sue_j [wants PRO_j to know if Mary loves Bill_1] and he_1 does [e]
         too.

      d. Sue_j [wants PRO_j to know if Mary loves him_1] and Bill_1 does
         [e] too.

These facts are exactly in accordance with the IOC.⁹

These apparently novel examples provide powerful evidence for the structure
of elliptical dependencies. For one, they argue against any non-syntactic
account: neither the discourse grammar of Williams (1977) nor the semantic
interpretation of Reinhart (1983) is able to maintain the purely syntactic
(i.e., sentence-level) distinction between local and nonlocal obviation
necessary for the IOC. Second, they argue against the standard asymmetric
account, where the coreference relations in the overt structure are simply
imposed on the null structure in a manner that satisfies "identity of
predication." Third, they also constitute new empirical evidence for the
existence
of an explicit relation of obviation in addition to relations of antecedence,
computed in the syntax at S-structure and subject to semantic interpretation
(Lasnik 1976:13ff; Chomsky 1980; Finer 1984; Higginbotham 1985).¹⁰

9 Chomsky (1981) and other authors have accounted for the strong crossover
phenomenon as a condition C effect. The fact that condition C effects do not
arise in elliptical structures gives us a direct empirical test for this
hypothesis. The lack of invisible strong crossover in (106) means that strong
crossover cannot be due to condition C.

(106) a. The man who_i Mary said that he_*i likes t_i.

      b. The man who_i he_*i said that Mary likes t_i.

      c. The man who_i Mary [said that she likes t_i] and who_i he_i did [e] too

10 Finer (1984) exhibits a class of human languages with a "switch-reference"
system, where the relations of coreference and obviation between subjects are
overtly expressed in the phonetic form of an utterance, by distinct morphemes.



We discuss the details of our representation


B.3 Necessary structure of an explicit theory

In order to represent the elliptical structures in the T-model with the
standard binding theory and using coindexing, we would need to postulate an
S-structure to LF mapping with the following six properties.

1. The mapping must include a copy operation capable of copying the
   entire overt structure to the position of the null structure, even when
   the null structure is in a different sentence in the discourse, a
   not-insignificant revision of the T-model, which is (was?) a theory of
   sentence grammar.

2. This copy operation must be able to replace an anaphoric element α
   with a variable that sometimes inherits the agreement features of α,
   as evinced by examples (51) above.

3. In order to account for the invariant interpretation of an anaphoric
   element α, the copy operation must be able to sometimes copy the
   referential index of α.

4. In order to account for the covariant interpretation of α without
   overgenerating as in (63, 64), the copy operation must be able to assign
   the copied α the referential index of the argument that corresponds to
   β (when it does not copy the referential index of α).

5. In order to account for the lack of invisible condition A or nonlocal
   condition C effects, we must confine binding conditions A and C to
   S-structure.

6. In order to account for the IOC, condition B must be enforced at
   LF, and R-expressions must be replaced with pronouns at S-structure,
   but only when the LF copy of an R-expression is c-commanded by a
   coreferential argument in its minimal governing category.

11 The representation for anaphora we propose is similar to that of Chomsky
(1980), with the crucial difference that our obviation is a relation between
positions: phrase-structure obviation may be nonlocal and is always overt,
while thematic-obviation is local and may be invisible. Chomsky's "anaphoric
indices" are relations among arguments and hence would not work for the
preceding examples of invisible obviation.

An explicit relation of obviation is independently motivated on conceptual
grounds. Representations have a degree of permanence beyond conditions on
those representations; conditions should not apply beyond the "creation" of
those representations. Obviation between two positions is a condition that
must be satisfied in the semantic interpretation of a linguistic
representation in the context of the discourse, and therefore is a relation
of its own, not merely a precondition on the construction of syntactic
relations of coreference or antecedence.

Let us therefore consider an alternate approach.

We have seen that the overt and null structures are symmetric with respect
to certain relations of referential dependency (antecedence, local obviation)
while being asymmetric with respect to conditions on those relations
(agreement, binding condition A). That is, the conditions are strictly
enforced in the overt structure, but blithely ignored in the null structure.
This strongly suggests that there is really only one underlying
representation, that the overt and null structures correspond to the same
underlying thematic function.

A more elegant representation for ellipsis, then, is to segregate phrase
structure from thematic structure and posit a relation of local obviation
that holds between thematic-positions. Then we would simply say that the overt
and null structures share the same thematic structure and hence they share
the same relations of linking and local obviation. Invariant and covariant
interpretations are accounted for by linking at the levels of phrase- and
thematic-structure, respectively. The IOC is accounted for by defining
condition B in terms of nonanaphors, as the obviation of thematic-positions.
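
The division of labor between the two levels of linking can be sketched as
follows. This is a toy illustration under my own assumptions, not the
representation developed in section B.4: the Link class, the resolve
function, and the dictionaries standing in for theta-discharge are
hypothetical names introduced only for exposition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    level: str    # "phrase" or "thematic"
    target: str   # a constituent (phrase level) or a theta-position (thematic level)


def resolve(link, theta_assignment):
    """Find the antecedent of a linked pronoun in one clause.

    theta_assignment maps each theta-position of the shared thematic
    function to the constituent that discharges it in that clause.
    """
    if link.level == "phrase":
        return link.target                   # fixed constituent: invariant
    return theta_assignment[link.target]     # follows the position: covariant


# One thematic function shared by the overt and null structures of
# "Felix hates his neighbors and so does Max".
overt = {"actor": "Felix"}
null = {"actor": "Max"}

covariant_link = Link("thematic", "actor")
invariant_link = Link("phrase", "Felix")
```

Because the link is stated once and the thematic structure is shared, no
copy operation is needed in this sketch: the same link is simply resolved
against whichever arguments discharge the theta-positions in each clause.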

Let us now make our representations explicit.


B.4 The proposed system of representation

B.4.1 Segregate phrase and thematic structure

We propose to segregate phrase structure from thematic structure as follows.
Phrase structure consists of a set of labeled points (syntactic constituents)
and the familiar relations of immediate domination, precedence, and so forth.

Thematic structure consists of arguments, functions from thematically-typed
argument positions to functions and to arguments, and the relations defined
on those objects, including relations of thematic discharge between
argument positions and arguments (theta-marking), as well as relations
entirely between argument positions (theta-binding and theta-identification).¹²
Arguments are identified by integral indices; functions are identified by
integral subscripts on the generalized function symbol f. For example,
"f_10(1 : role1, 2 : role2)" identifies the particular function f_10, a
function of two thematically-typed arguments. For clarity, only the
theta-position that is currently being discharged is depicted, as in
"f_10(2 : role2)": the theta-position 2 of theta-type role2 is being
discharged. By convention, the (curried) arguments of a given function are
assigned successive indices starting with 1.

Motivated by Marantz (1990), I propose thematic-functions of order 3. That
is, a V0 function f_i(.) in combination with its most affected internal
arguments (such as inalienable possessor, theme/patient, instrument, or
affected object locative) returns a new verb-stem thematic function f_j(.)
(Marantz's "event 1"). The verb-stem function f_j(.) combines with the next
most affected internal arguments (such as benefactive, directional locative,
or alienable possessor) to return a new VP thematic function
f_k(1 : tense, 2), which is a function from an I0 tense and an external
argument (typically an actor) to an entirely saturated function (an argument
of thematic-type "proposition").¹³ This will play a role in my account of why
a verb plus the benefactor is never the target of ellipsis; see section B.5.1.
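
The three-stage composition can be pictured with curried functions. The
Python sketch below is a loose illustration under my own assumptions and is
not the report's notation: the name v0_function, the dictionary used for a
saturated proposition, and the use of variadic arguments for "most affected"
and "next most affected" arguments are all hypothetical.

```python
# Hypothetical sketch: thematic functions of order 3 as nested closures.

def v0_function(verb):
    """Return the V0 function for `verb`.

    Stage 1: V0 function + most affected internal arguments -> verb-stem function.
    Stage 2: verb-stem function + next most affected arguments -> VP function.
    Stage 3: VP function + (tense, external argument) -> saturated proposition.
    """
    def f_i(*most_affected):              # e.g. theme/patient
        def f_j(*next_affected):          # e.g. benefactive
            def f_k(tense, actor):        # I0 tense and external argument
                return {
                    "verb": verb,
                    "inner": most_affected,
                    "outer": next_affected,
                    "tense": tense,
                    "actor": actor,
                }
            return f_k
        return f_j
    return f_i


hate = v0_function("hate")                 # V0 function
verb_stem = hate("his neighbors")          # verb-stem function
vp = verb_stem()                           # VP function (no further internal args)
proposition = vp("-past", "Felix")         # entirely saturated
```

One consequence of this layering, visible even in the toy version, is that
the verb plus its most affected arguments forms a unit (verb_stem) before any
less affected argument is added, so there is no intermediate function
corresponding to the verb plus a benefactive alone.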

E lie relation between phrase and thematic structures is represented as the 
pairwise connection between elements in the structures. For example, (107) 
• 5 assigned Lhe structure described in j108-110), where we have suppressed 

h.s mentioned Ln chapter 4, a thematic {billon must Inherit the a^ictmnni features 
af the- argument Ihal saLoratrs it. for Him; argument* The jet of prone*ms to which this, 
process applies varies tram. spr-jJcct U> apsaknr. 

11 The notation used for thematic structure (i.e., functions of various orders) does not
matter formally because all choices are formally equivalent. Anything done by one can
be done by the other with ambiguity (owctliyt). Marantz (1989) distinguishes two
representations of thematic functions. In one, the verb is a 0-order function of all arguments
contained in the proposition (internal arguments, event, and external argument). This
approach makes use of "thematic-grids" (cf. Stowell 1981; Levin and Rappaport;
Higginbotham 1985; 1989). Marantz contrasts this with the use of higher-order functions,
which have the advantage of naturally representing the fact that less affected arguments
are assigned a compositional theta-role, resulting from a verb in combination with its
more affected internal arguments (cf. Chomsky 1981; Marantz). To more naturally
capture compositional theta-role assignment, I use higher-order functions. Another
reason for using higher-order thematic-functions is to define a c-command relation on
thematic-positions, which will greatly simplify the statement of the binding theory.





many important details not relevant in this context.

(107) Felix hates his neighbors.


The possessive morpheme ['s] is a function of two arguments, the possessor
and the possession (i.e., the possessed thing). First the pronoun he is
theta-marked with the theta-position f_20(1). Then f_20(2) theta-binds f_10(1),
resulting in argument 4 (the closed NP his neighbors).

(108) [NP [NP he, 3] ['s, f_20(1 : owner), f_20(2 : possession)]
          [N0 neighbors, f_10(1 : object)], 4]

Next, the V0 function f_30(1 : object) of the verb hate theta-marks argument
4, and returns the VP function f_40(·):

(109) [VP [V0 hate, f_30(1 : object)] [his neighbors, 4],
          f_40(1 : event, 2 : actor)]

The phrase-thematic structure detailed in (108) is italicized and summarized
in (109). Finally, f_40(1) theta-marks argument 2 (the [-past] tense I0), and
f_40(2) theta-marks argument 1 (the subject Felix), resulting in argument 5
(a proposition):

(110) [IP [NP Felix, 1]
          [I1 [I0 [-past], 2] [VP hate his neighbors, f_40(1 : event)],
              f_40(2 : actor)], 5]
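
The step-by-step saturation in (108-110) can be pictured with ordinary higher-order functions. What follows is only an illustrative sketch, not part of the theory: the names hate_v0 and vp, the use of Python closures, and the dictionary encoding of a proposition are all assumptions made for the example. The verb hate has a single internal argument, so the intermediate verb-stem function is omitted here.

```python
from typing import Callable, Dict

Proposition = Dict[str, str]
VPFunction = Callable[[str, str], Proposition]


def hate_v0(obj: str) -> VPFunction:
    """Hypothetical V0 function of 'hate': saturate (1 : object), return the VP function."""

    def vp(event: str, actor: str) -> Proposition:
        # VP function: position 1 is the tense/event, position 2 is the actor.
        return {"predicate": "hate", "object": obj, "event": event, "actor": actor}

    return vp


# Argument 4 (his neighbors) saturates the object position first ...
vp_hate_neighbors = hate_v0("his neighbors")
# ... then the I0 tense and the subject saturate the VP function.
proposition = vp_hate_neighbors("-past", "Felix")
```

The intermediate value vp_hate_neighbors is the kind of object that the later sections treat as shareable between an overt and an elliptical proposition.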


B.4.2 Two relations of referential dependency

We further propose two relations of referential dependency, link and obviate,
defined on constituents (phrase structure points) and on theta-positions (the
argument positions of thematic functions). The linking of theta-positions
is favored over the coindexing of theta-positions by the same reasons that
favor linking over coindexing in the phrase structure.12 It is needed to

12 There are arguments both ways: whatever arguments can be made for linking can
also be made for theta-linking. Lasnik ( 190 - 8 ) argues for a relation of coindexing on the
basis of condition C effects. Binding condition C is somewhat mysterious in a linking
theory because R-expressions need never have antecedents, yet must have links to induce
condition C effects.





reveal more information about split antecedents and the interpretation of
direct/indirect antecedence. For expository convenience, we write "α
theta-links to β" when we mean that "the theta-position assigned to α links to
the theta-position assigned to β."

We significantly simplify the binding theory by representing all arguments
of the verb in one order-3 thematic-function. This reformulation of the
binding theory does not solve many well-known problems for the standard
formulation, such as the permissibility of John_i has himself_i for him_i to
blame t_i.

Let the argument α be assigned the theta-position f_i(j) and be governed by
the c-commanding function f_k(·). Then condition A requires a link between
f_i(j) and some f_k(m) for [+anaphor] α. Condition B states that f_i(j)
obviates all f_k(m) for [-anaphor] α. (Recall that condition B must be stated in
terms of [-anaphor], rather than the widely assumed [+pronominal], in order
to obtain the EOC effects for pronouns as well as R-expressions.) When α is
controlled PRO, then it is obligatorily theta-linked to its controller, always
resulting in the covariant interpretation. All such referentially-dependent α
must be linked or theta-linked to some β at S-structure, subject to these
binding conditions. Binding condition C requires that an R-expression
obviate all c-commanding phrase structure positions (in the domain of the
head of its chain, if strong crossover is reduced to condition C effects for
variables).

For example, obviate(f_30(1), f_40(2)) holds in example (108-110), and when
the antecedent of his is Felix, then either link(3,1) or link(f_20(1), f_40(2)). We
leave unanswered the question of whether obviate(/ lfr (l] f / w (2)) should be
included in this list of referential dependencies, as it seems it should be.

Following Higginbotham (1985:573-4), link(α, β) is interpreted to mean
'include some values of β in the values of α,' while obviate(α, β) is interpreted
to mean 'α and β cannot share any values in the structure in which they
occur.'13
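
One hedged way to read these two definitions is as constraints on sets of values. The sketch below assumes that every point (a constituent or a theta-position) is mapped to a set of referents, and that "include some values" means sharing at least one value. The names link_holds and obviate_holds and the sample assignment are hypothetical, introduced only for illustration.

```python
from typing import Dict, FrozenSet

Values = Dict[str, FrozenSet[str]]


def link_holds(values: Values, alpha: str, beta: str) -> bool:
    """link(alpha, beta): alpha's values include at least one value of beta (assumed reading)."""
    return bool(values[alpha] & values[beta])


def obviate_holds(values: Values, alpha: str, beta: str) -> bool:
    """obviate(alpha, beta): alpha and beta share no values."""
    return values[alpha].isdisjoint(values[beta])


# Hypothetical values for "Felix hates his neighbors", reading his = Felix.
values: Values = {
    "Felix": frozenset({"felix"}),
    "his": frozenset({"felix"}),
    "his neighbors": frozenset({"ann", "bob"}),
}

his_linked_to_felix = link_holds(values, "his", "Felix")
object_obviates_actor = obviate_holds(values, "his neighbors", "Felix")
```

Under this reading a link and an obviation between the same two points can never hold together, which is the property the invisible obviation cases rely on.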

The necessity of explicitly representing the invariant interpretation using
one (shared) structure prevents us from making our coreference relation
entirely between thematic positions. Alternately, the necessity of explicitly
representing all coreference relations at S-structure, including covariant
interpretations, prevents us from making antecedence a relation entirely
between phrase structure positions. Therefore, both instances of the link
relation are needed in the theory in order to explicitly represent a
perceivable distinction between invariant/covariant interpretations. The use
of one linking relation defined on two different types of points should not
raise objections on the grounds of parsimony when both are necessary, as
well as independently motivated. The linking of phrase-structure positions
is motivated by Higginbotham (1983) for anaphoric antecedence, while the
linking of thematic-positions is motivated by Higginbotham (1985; 1989) as
a primitive semantic operation. The mathematician should not find the
generalization of linking to include thematic positions objectionable,
because it has little effect on the computational and generative power of the
theory. (In fact, it is a straightforward proof that representing ellipsis is in
the complexity class NP, a theorem that is not so obvious when a copying
operation is used instead.)

13 This does not quite work for cases of split antecedents, where the verb seems to have a
significant effect on whether uiduouh obviation is enforced. For example, [ ][ J ,- L | j ; . .f... r_
(111b) only slightly degraded, and (111c) entirely unacceptable; yet all are excluded by
condition B.


B.4.3 Ellipsis as a shared thematic-function 

An elliptical structure, then, is a proposition P' that contains a thematic
function f_i(·) that is not phonologically realized. Rather, it is borrowed
from a proposition P that appears earlier in the discourse, where f_i(·) is the
result of combining an overt verb with some of its arguments in P.

The borrowed thematic function must be composed in the same way, as 
shown in (112) for a raising verb, and in (113) for passive. 



(112) a.   A man [arrived with his mother] and a woman did [e] too.
      b. * There [arrived a man with his mother] and a woman did [e] too.


t 11J k. Jobm suggested to Bill? th»4 h=i sKddi 

b. ? Havaire; suggESted. (e Benedict* that pETsuade !ktsn n j| ta abjure 
SEJ1SU&1 pIlEBBUrES 

V, * Jul'.sii to RLILj that ho 3 tell th.Em 1t jj, m h-sue 





(113) a.   A state college [granted Charlie a degree] and a private college
           did [e] too.
      b. * Charlie [was granted a degree] and a private college did [e] too.

According to the principle of full interpretation, a logical constant is licensed
only if it saturates a thematic-position, whereas a logical operator is licensed
only if it binds a logical variable that saturates a thematic-position.
Consequently, a logical variable is licensed only if it both saturates a
thematic-position and is bound by an operator. An elliptical structure is
licensed only if all overt elements that it contains are licensed. Therefore, the
elements in an elliptical structure are subject to two constraints: (i) each
logical argument (constant or variable) must be assigned a thematic-position by
the shared thematic function; (ii) each logical operator with scope over the
shared thematic function f_i(·) must bind a thematic-position that is free
in f_i(·). The denotational semantics of an elliptical structure is given by
substitution.

The principle of full interpretation, then, establishes an element-wise bijection
between some elements in P and some in P'. An element α ∈ P and an
element α' ∈ P' are said to correspond iff (i) both are logical arguments that
saturate the same thematic-position f_i(j), for some j; or (ii) both are logical
operators that bind the same thematic-position f_i(k), for some f_i(k) free in
f_i(·).


Argument correspondence


Let us now examine, in detail, our representations for the central case of
VP-ellipsis. VP-ellipsis results in correspondences between two I0 arguments,
and between two external arguments. The example (114a) of VP-ellipsis
is assigned the partial representation (114b-d). The shared VP thematic
function f_40(·) appears in (114c).17

17 For clarity, I have not represented the thematic structure of the coordinating
conjunction. However, it is clear that coordinators are higher-order functions, from a sequence
of thematic functions f_1(·), ..., f_n(·) of identical structure to a new function f_n+1(·), also
with the same structure. This, in any event, is the intuition underlying "across-the-board"
constraints.







(114) a. Felix hates his neighbors and Max does too.

      b. [NP [NP he, 3] ['s, f_20(1 : owner), f_20(2 : possession)]
             [N0 neighbors, f_10(1 : object)], 4]

      c. [VP [V0 hate, f_30(1 : object)] [his neighbors, 4],
             f_40(1 : event, 2 : actor)]

      d. [[IP [NP Felix, 1]
              [I1 [I0 [-past], 2] [VP hate his neighbors, f_40(1 : event)],
                  f_40(2 : actor)], 5]
          [and [IP [NP Max, 6]
                   [I1 [I0 [-past], 7] [f_40(1 : event)],
                       f_40(2 : actor)], 8]], 9]

A perceivable ambiguity arises when β is the external argument of a VP in a
VPE structure. In this case, a link gives rise to the invariant interpretation,
while a theta-link yields the covariant interpretation. (In other cases, link
and theta-link can both result in an invariant interpretation.) Thus, the
partial representation in (114) may be completed in one of two ways:


(115) a. obviate(f_30(1), f_40(2)), link(3, 1)

      b. obviate(f_30(1), f_40(2)), link(f_20(1), f_40(2))

The link in (115a) gives the invariant interpretation, while the theta-link in
(115b) gives the covariant interpretation. (Recall that f_20(1) is assigned to
argument 3, the pronoun he.)
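
The contrast between the two completions can be sketched computationally. This is a loose illustration under stated assumptions, not the analysis itself: the shared VP function is modelled as a Python function of its actor, a link is modelled as fixing the pronoun to the constituent Felix, and a theta-link as tying the pronoun to the actor position of the shared function. The names make_vp, invariant_vp, covariant_vp, and vpe are hypothetical.

```python
from typing import Callable, List

VP = Callable[[str], str]


def make_vp(pronoun_value: Callable[[str], str]) -> VP:
    """Build a shared VP 'hate X's neighbors'; pronoun_value picks X given the actor."""

    def vp(actor: str) -> str:
        return f"{actor} hates {pronoun_value(actor)}'s neighbors"

    return vp


# link(3, 1): the pronoun depends on the constituent Felix, whoever the actor is.
invariant_vp = make_vp(lambda actor: "Felix")
# theta-link: the pronoun depends on the actor position of the shared function.
covariant_vp = make_vp(lambda actor: actor)


def vpe(shared_vp: VP, first: str, second: str) -> List[str]:
    """Apply one shared VP function to the subjects of both conjuncts."""
    return [shared_vp(first), shared_vp(second)]


invariant_reading = vpe(invariant_vp, "Felix", "Max")
covariant_reading = vpe(covariant_vp, "Felix", "Max")
```

In both readings the first conjunct comes out the same; only the elliptical conjunct distinguishes them, which matches the observation that the ambiguity is perceivable only in the VPE structure.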


Operator correspondence 

We just saw how the outer arguments of an elliptical structure can share the
thematic function created in a distinct proposition by a verb in combination
with its inner arguments. The proposed system allows another possibility,
where an operator with scope over a thematic function shares that thematic
function with a corresponding operator at LF. That is, if α is an operator
that binds a variable e in an overt proposition P and α' is the corresponding
operator with scope over an ellipsed proposition P', then α' will be
interpreted as if it also binds the variable e in P'. Recall that logical constants
may become operators as the result of a focusing process. Focus may be
reliably correlated with stress, at least in English.




This solves the difficulty discussed above in the context of the Sag-Williams
proposal, which is the problematic cases of nonsubject covariance and
CP-ellipsis. Observe that when the corresponding arguments receive parallel
phonological focus, then both may be antecedents of the covariant pronoun
in (116a). However, when only one argument is focused, as in (116b), or
when the focus is applied elsewhere (116c), then the covariant interpretation
is no longer available. (In fact, (116c) can only mean that "the bursar paid
the student," never "the bursar paid the student expenses.")

(116) a.   The bursar paid the PROFESSOR_1 [his_1 expenses] and the
           STUDENT_2 too.

      b. ? The bursar paid the PROFESSOR_1 [his_1 expenses] and the
           student_2 too.

      c. * The BURSAR paid the professor_1 [his_1 expenses] and the
           student_2 too.

Given these facts, the explanation of nonsubject covariance would seem to
lie not in reducing nonsubjects to subjects, but in reducing corresponding
antecedents to corresponding logical operators. That is, focused elements are
assigned scope at LF, and when the corresponding thematic arguments of a
coordinate structure are focused, then the covariant interpretation becomes
possible. The LF representation assigned to the covariant interpretation of
(116a) would involve proposition-ellipsis and look something like (117):

(117) [[+FOCUS the professor]_i [the bursar paid e_i his_i expenses]_j] and
      [[+FOCUS the student]_i [e]_j]

The examples of CP-ellipsis would be assigned a similar structure, where the
focused actor and focused benefactor become logical operators with scope
over the shared proposition that he thought she was drunk, perhaps as in
(118) for the example (76).

(118) [John_i [Betty_j [e_i told e_j [that e_i thought e_j was drunk]_k]]] and
      [Orville_i [Naomi_j [e_i told e_j [e]_k]]] too

The covariant interpretation of the anaphoric elements in the shared
proposition is enhanced by the "parallelism cue" too, as well as by the equivalence
between matrix verbs.


The possibility of operator correspondence resolves an open problem, namely
the fact that an anaphoric element may receive a covariant interpretation
even when its antecedent is not an argument of the shared thematic function,
provided that the corresponding antecedents are focused, as in (119), or they
are inherently logical operators, as in (120).


(119) a. TOM_1 [said that Sue [kissed him_1]] before BILL_2 asked Mary to
         [e].
         ([kiss him_1/2])

      b. TOM_1 [wanted Sue to [kiss him_1]] before BILL_2 asked Mary to
         [e].
         ([kiss him_1/2])


(120) a. Which man_1 [said that Sue [kissed him_1]] before which boy_2
         asked Mary to [e].
         ([kiss him_1/2])

      b. Every man_1 [wanted Sue to [kiss him_1]] before some boy_2 asked
         Mary to [e].
         ([kiss him_1/2])

(The relevant interpretation is the one where the adjuncts are understood
as being associated with the higher verb, i.e., before.)

An interesting property of the example (121) is that a verb with its
benefactive argument, i.e., inform Harry, is in some sense equivalent to a verb
without any of its internal arguments, i.e., say.

(121) TOM_1 said [that Sue kissed him_1] and BILL_2 informed Harry_3
      [e].
      ([that Sue kissed him_1/2])

We know on independent grounds that the benefactor does not saturate the
first argument position of the verb; rather it is an argument of the verb plus
its theme/patient (Marantz, 1984). That is, the complex predicate analysis
of double object constructions states that inform, inform S, and inform NP S
correspond to possible thematic functions, whereas inform NP cannot. The
covariant interpretation of example (121) cannot be accounted for as
theta-linking, given the complex predicate analysis suggested in section B.1.2, but




may be accounted for straightforwardly as operator correspondence in the
analysis proposed in this section.

The central conceptual problem with the proposed system is one of
parsimony. To put things in the worst possible light, linking can result in an
invariant interpretation when the antecedent is a logical constant, or in a
covariant interpretation when the antecedent is a logical operator. Likewise,
theta-linking can result in a covariant interpretation, or an invariant
interpretation when the antecedent is a logical constant outside the domain of the
shared thematic-function. Thus, it would seem that theta-linking is entirely
unnecessary, that we can always account for the covariant interpretation as
linking to a logical operator.

However, this is not the case. What needs to be explained is the complex
interaction among (i) the phonology (stressed/unstressed antecedents), (ii) the
logical type of the antecedent (operator/argument), (iii) the domain of the
shared thematic-function (includes/excludes antecedent), and (iv) the
invariant and covariant interpretations. Linking, theta-linking, and argument
and operator correspondence are all needed in order to account for the
complex array of facts we have seen so far. The next section presents additional
evidence.


B.5 The space of elliptical structures

In this section, we exercise the proposed system. First, we enumerate the
empirical consequences of our decision to use thematic functions of order-3.
Next, we show how the proposed system accounts for the tricky cases of
invisible crossover, invisible obviation, and recursive ellipsis.

B.5.1 Possible domains of ellipsis

Given our decision to use thematic functions of order-3, we predict that the
V0 function, the verb-stem function, the VP function, and the saturated
proposition function are the only possible targets for ellipsis. Let us briefly
consider each in turn.

True V0-ellipsis is only possible for verbs of only one argument, the external
argument, as in (122).




(122) a. Bill [left] and John did [e] too.

      b. Bill wanted to [leave] and John wanted to [e] too.

(It is of course difficult to distinguish V0-ellipsis from verb-stem or
VP-ellipsis in these cases.) Other examples of V0-ellipsis, such as (123), are
excluded by independent principles of the grammar, perhaps the case filter
applied to internal arguments.

(123) a. * John [saw] Mary and BILL did [e] Kate (too).

      b. * John [gave] Mary books and BILL did [e] Kate records.

Gapping structures, as in (124a), cannot involve true V0-ellipsis, because
both the verb and its I0 tense argument are gapped. This means we cannot
employ a theta-linking analysis. Rather, we must follow Pesetsky (1982) in
analyzing gapping as LF argument raising, perhaps by the mechanism of
focus, combined with proposition-ellipsis, as sketched in (124b). That is,
gapping is analyzed as the correspondence of logical operators.


(124) a. John saw Mary and Bill, Kate.
      b. [John_1 [Mary_2 [e_1 saw e_2]_3]] and [Bill_1 [Kate_2 [e]_3]]

Verb-stem ellipsis is exemplified in (125) for the verb plus theme/patient.

(125) John [donated money] to the Red Cross and Bill did [e] to the
      Boy Scouts.

Again, the verb plus directional locative cannot be ellipsed without
violating the case filter. However, LF argument raising may be combined with
proposition ellipsis, as shown in (126), to create the appearance of verb-stem
ellipsis.


(126) a. John took a bus to New York and Tom, a plane.

      b. [John_1 [a bus_2 [e_1 took e_2 to New York]_3]] and [Tom_1 [a plane_2
         [e]_3]]


The verb plus benefactive cannot be ellipsed unless the theme/patient is
also ellipsed (127).

(127) a. * John [donated] money [to the Red Cross] and Bill did [e] time.
      b. * John [gave the Red Cross] money and Bill did [e] time.

Nor can the verb plus directional locative be ellipsed unless the instrument
is also ellipsed (128).

(128) * John [took a plane] to New York and Tom did [e] to Los
      Angeles.

It is possible, however, to ellipse the verb plus benefactive when the verb
is passivized, raising the theme (129a). When the I0 morpheme is missing,
as in (129b), the utterance must be analyzed as operator correspondence
combined with proposition ellipsis.

(129) a. Money was [given (to) the Red Cross] and time was [e] too.
      b. Money [was given the Red Cross] and time [e] too.

Again, the gapping structures in (130) must be distinguished from true
ellipsis of the verb plus benefactive or verb plus directional locative, which
are both impossible.

(130) a. John donated money to the Red Cross and Bill, time.

      b. John took a plane to New York and Tom, to Los Angeles.

Proposition ellipsis is exemplified by gapping structures, and by the
examples of nonsubject covariance, as discussed above in section B.4.3.


B.5.2 Invisible crossover

Recall that Higginbotham's (1985) theta-binding relation of thematic-discharge
holds between a determiner and the open theta-position associated with
nominals. Chomsky (1982) suggests that the impossibility of iterating
determiners may be related to a prohibition against vacuous quantification.
This motivates Higginbotham to equate theta-binding to the quantification
of theta-positions, in order to block iterated determiners. The theta-link
relation of referential dependency proposed here and Higginbotham's
theta-binding relation are sufficiently similar to suggest that theta-binding reduces
to theta-linking, perhaps under the stricter locality constraints applying
generally to all relations of thematic-discharge. If this is so, as certainly
seems plausible, then our system equates the covariant interpretation of an
anaphoric element to the quantification, and "invisible crossover" effects
should accrue in VPE. One such example, first noted by Dahl (1972;J,97>H
and first explained by Sag (1976:137) as crossover violation, is (131a), whose
null structure has the three interpretations shown in (131b-d).


(131) a. Bill_1 [believed that he_1 loved his_1 wife] and Harry_2 did [e] too.
h. ([5efieiW fth^z? ke^ ici'c-d his 2 f 2 
d, {| Deleted Ihol ^C| tetrad /+rf]/.-j r ^M) 


In order to obtain interpretation (131b), he must be theta-linked to Bill,
while his may be either linked or theta-linked to Bill in the overt structure,
resulting in an ambiguity in the null structure. However, in order to obtain
interpretation (131c), he must be linked to Bill in the overt structure. Now
in order to obtain the covariant interpretation of his in the null structure,
his must be theta-linked to the external argument of the matrix VP (Bill)
in the overt structure. But this is excluded as a crossover configuration in
the first conjunct of (131a): the theta-position assigned to Bill_1 "quantifies"
the theta-position assigned to his_1, crossing over a pronoun (he_1) that is
linked to that argument.


B.5.3 Invisible obviation

Invisible obviation structures are illustrated in the representation (134) of
our previous example (91). The verb want corresponds to a function of one
internal argument, (1 : proposition).14 The shared thematic function
appears in (134b).

14 The bare VP examples in (132) show that the PP headed by before should be viewed
as an adjunct to the VP, rather than as an argument of the I0 event or as the innermost
argument of the VP-function, as some have suggested (cf lamn 14ft? fn.it).

(132) a. [Ran six miles before lunch] is what Randy did.
      b. [Do that after eating breakfast] is what Rod did.
      c. [Do that] is what Rod did after eating breakfast.

This is because the antecedent of the that pronoun, whatever it is, is not understood
as including the PP. Rather, the thematic function corresponding to before should be





■a- Rottmoj wanted ftoaajjne to [love him] before wanting himself 
to fe] 

b. [vip[v4>lQve t /|,{1 : object)] 5], 

J%(1 : ■ actor)) 

0- [vptvQWHit, /jofi : proposition!] 

[ip[np Rosaline, 3) 

h(jo[-^&h+to] n ■*] [vp&M firm, Jfo(l : event }| h 
f$ 0 (2 : eeteis)], 6], 

I - ***nt*2 ■ actor)] 

d. [cpf^PRO, 7j 

[]j[]ft[-t.iis t +li(if] J 8) 

[vp(volant, f^n( 1 : proposition)] 

(if[np himself, # |[jo[-bn5,+ to], 10] [/£,(! : event)], 

/aoC 2 ; actor)], 11], 

: event)], 

^o(2 : a«ot)] h 12] 

e. [[p]np Romeo, 1] 

|ll|lo[+pMt], 2] 

[vp[vpu«Fil Rosaline, to toot him. /^ 0 (1 : event)) 

[pp before [ipPJiO want+ing himself to, 12]| 

/? D (1 : event)], 

/?q( 2 : actor)], 13] 

The associated referential dependencies are:


(135 i obviltf{/*,(!),l«iM/?*C*M£rf2)h linfej(^t2),/fo(2)) 
b- obviate,;/l ( 1; [2) ], link L (9,7) p ikk 2 (^(2),/f 0 (2 )) 

fiv\l ■ frtat. i : fipenv), and in l be itriLcEvie (334), fail) eitbex tbsttnuki ox Lhrla- 
Ldentili-r-E- ud /iq(£) ii (-JieLi-idenLLRed -with f{-_yV\. Th« name argument may be 

applied to manner adverbials, which raises difficulties for any theory that analyzes them
as the innermost argument of the verbal function:

(133) a. [Ran six miles quickly] is what Randy did.
      b. [Do that slowly] is what Rod did.
      c. [Do that] is what Rod did slowly.


The relation of obviation is between him and Rosaline (the other obviation
relation, between Rosaline and Romeo, has been suppressed for clarity).
link1 is from the anaphor himself to controlled PRO;
link2 is from controlled PRO to Romeo. By the semantics of the link
relation, both arguments 9 and 7 must include the value of argument 1 in their
values. By the semantics of the obviate relation, none of the arguments assigned
f_10(1) (that is, argument 5) may share a value with any of the arguments
assigned f_20(2) (that is, arguments 3 and 9).

If we enforced coreference between him and Romeo, we would have to add a
linking relation to either (135a) or (135b): either link(5,1) or the corresponding
theta-link between their theta-positions. The effect of adding either link is to
include the values of argument 1 in the values of argument 5; but then the
arguments 9 and 5 share the values of argument 1, which is expressly
forbidden by the obviate(f_10(1), f_20(2)) relation.
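
The clash just described can be checked mechanically. The sketch below is a simplification under stated assumptions: dependencies are stated directly over the argument numbers of (134) rather than over theta-positions, a link is taken to copy all values of the antecedent into the dependent (harmless here, since each antecedent has a single value), and the names propagate and violated are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[int, int]


def propagate(values: Dict[int, Set[str]], links: List[Pair]) -> Dict[int, Set[str]]:
    """For every link (dependent, antecedent), add the antecedent's values to the dependent."""
    result = {arg: set(vals) for arg, vals in values.items()}
    changed = True
    while changed:
        changed = False
        for dependent, antecedent in links:
            before = len(result[dependent])
            result[dependent] |= result[antecedent]
            changed = changed or len(result[dependent]) != before
    return result


def violated(values: Dict[int, Set[str]], obviations: List[Pair]) -> List[Pair]:
    """Return the obviation pairs whose two arguments share a value."""
    return [(a, b) for a, b in obviations if values[a] & values[b]]


# Arguments: 1 Romeo, 3 Rosaline, 5 him, 7 PRO, 9 himself.
base = {1: {"Romeo"}, 3: {"Rosaline"}, 5: set(), 7: set(), 9: set()}
links = [(9, 7), (7, 1)]       # link1: himself -> PRO; link2: PRO -> Romeo
obviations = [(5, 3), (5, 9)]  # him may share no value with Rosaline or himself

without_coreference = violated(propagate(base, links), obviations)
with_coreference = violated(propagate(base, links + [(5, 1)]), obviations)
```

Adding the link from him to Romeo makes arguments 5 and 9 share a value, so the obviation between them is reported as violated; without that link no violation arises.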


B.5.4 Recursive ellipsis

The last of the tricky cases are the examples of recursive ellipsis. The
example (35) from chapter 4, reproduced here as (136), has the three possible
interpretations paraphrased in (137).



(136) Jack_1 [[corrected his_1 spelling mistakes]_i before the teacher_2 did
      [e]_i]_j and Ted_3 did [e]_j too.


(137) a. Jack_1 corrected his_1 spelling mistakes before the teacher_2
         corrected his_1 spelling mistakes and Ted_3 corrected his_1 spelling
         mistakes before the teacher_2 corrected his_1 spelling mistakes.

      b. Jack_1 corrected his_1 spelling mistakes before the teacher_2
         corrected his_2 spelling mistakes and Ted_3 corrected his_3 spelling
         mistakes before the teacher_2 corrected his_2 spelling mistakes.

      c. Jack_1 corrected his_1 spelling mistakes before the teacher_2
         corrected his_1 spelling mistakes and Ted_3 corrected his_3 spelling
         mistakes before the teacher_2 corrected his_3 spelling mistakes.


The purely invariant and covariant interpretations (137a,b respectively)
present no problems for the standard identity-of-predication analysis. They
are analyzed in the current proposal as linking and theta-linking. The




mixed interpretation (137c), which has been considered indubitable
evidence against identity-of-predication and for a copying operation, is also
straightforwardly analyzed in the current proposal as linking with operator
correspondence between Jack and Ted, after argument raising. This analysis
is confirmed by the fact that both Jack and Ted must be heavily stressed
before the interpretation (137c) becomes available.


B.6 Conclusion

The central technical goal of this chapter has been to contribute to the
development of a linguistic theory that is explicit, appropriate, and
maximally constrained.

In order to be appropriate and maximally constrained, the theory should
not use a copy operation. One natural consequence of copying is that the
original and its copy are independent with respect to subsequent processes.
That is, once the original is copied, subsequent processes will apply to both
the original and its copy independently. But as we saw, the overt and null
VPs are not truly independent in a VPE structure, and therefore the copy
operation is inappropriate and should be allowed into a restrictive theory
only as a last resort.
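
The independence property of copying has a familiar programming analogue, offered here only as an illustration. The dictionary encoding of a VP and the field names are assumptions; the point is just that a copied structure does not see later changes to the original, whereas a shared structure does.

```python
import copy

# Hypothetical encoding of the overt VP; the pronoun "his" is not yet resolved.
overt_vp = {"verb": "hate", "object": "his neighbors", "his": None}

copied_vp = copy.deepcopy(overt_vp)  # copy analysis: an independent duplicate
shared_vp = overt_vp                 # sharing analysis: one structure used twice

# A subsequent process resolves the pronoun in the overt VP.
overt_vp["his"] = "actor"

copy_sees_change = copied_vp["his"] == "actor"
share_sees_change = shared_vp["his"] == "actor"
```

Only the shared structure reflects the later resolution of the pronoun, which is the kind of dependence between overt and null VPs that the copy operation fails to guarantee.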

In order to be explicit, a linguistic theory must represent all perceivable
linguistic distinctions. As Chomsky (1965:4-5) observes, "a fully adequate
grammar must assign to each of an infinite range of sentences a structural
description indicating how this sentence is understood by the ideal
speaker-hearer. This is the traditional problem of descriptive linguistics. . . ." In
general, distinct interpretations must correspond to distinct linguistic
representations. In particular, all referential dependencies, such as perceived
coreference between an anaphor or pronoun and its linguistic antecedent,
must be represented by an explicit syntactic relation of coreference, even if
the binding conditions are stated in terms of obviation. To do otherwise will
result in a less than adequate theory.




